Medalta Museum Digital Preservation Plan

The following document is a copy of Medalta Museum’s Digital Preservation Plan, a working document which was current at the time of publication, February 2016. This plan was produced using the Canadian Heritage Information Network’s (CHIN) Digital Preservation Plan Framework for Museums and has been modified for web accessibility. It is published as part of CHIN’s case study of digital preservation work at Medalta Museum and is one of many digital preservation resources available in CHIN’s Digital Preservation Toolkit.

Reference to Digital Preservation Policy

This document was developed in accordance with the Medalta Digital Preservation Policy Document.

As per the policy:

This plan is to support Medalta Museum’s operations by preserving and ensuring long-term access to digital artefacts and digital representations of the museum holdings and the holdings of the Medicine Hat Clay Industries National Historic District, and by preserving and ensuring access to any digital asset necessary for these operations.

Access to all of these digital assets is to be made available, on demand, to a limited number of museum staff, who may in turn decide to whom copies of the asset should be distributed.

Summary of Environment and Constraints

The following constraints are to be taken into account in the selection of any action plan.

Organizational Structure

Medalta Museum, a living museum celebrating the history of Medicine Hat’s brick, clay and pottery industry, is one of many operations which function as part of the Medicine Hat Clay Industries National Historic District. Approximately 17 full-time and 4 part-time casual employees work and move freely within the various operations under this organization. An additional 16 part-time casual staff are dedicated to special events. Operational decisions are made by staff, who are ultimately accountable to the Executive Director and General Manager of the Historic District. In turn, the Executive Director and General Manager is accountable to the organization’s Board of Directors.

Current Practices and Obligations

  • The museum is open year-round. Staff also support the operation of the Shaw International Centre for Contemporary Ceramics and its resident artists.
  • The museum receives some of its funding from the Medicine Hat municipality and has strong ties with the city by serving as a cultural anchor. The expansive buildings and grounds are often host to various cultural events attended by the town’s residents and are also made available for private functions.
  • Salaried staff, all of whom have training or experience in their respective fields of expertise, run the museum on a daily basis.
  • Museum operations have been expanding over the past decade, and planning must take into account both current needs and future requirements. For instance, very little of the museum’s holdings has been digitized. However, funding is in place to digitize a significant amount, both through photography and 3D scanning.
  • Likewise, the museum has been using MS Access to manage collections records, but it is migrating this data to PastPerfect Online. This will likely increase demand on digital content, as it can be made available online.

Organizational Readiness

Museum staff and management are taking a leadership role in working with external organizations to prepare a digital preservation plan and policy where few currently exist in the Canadian museum environment. Both staff and management recognize the need for a digital preservation plan and are prepared to implement one that respects available resources.

Financial Constraints

Medalta receives funding from the municipality and from various private and public sector sources. Funding is currently in place for digitization projects, and as part of the sustainability of this work, a digital preservation plan needs to be put in place.

Simple plans that amount to backups with some preservation data will not be subject to financial constraints. More complex plans (such as the use of a trusted digital repository) will require additional funds to be earmarked as core (not project) funding.

Human Resources

The museum has salaried staff possessing various competencies in museum management and collections management. However, there are no onsite staff that are strictly dedicated to IT or technical support; instead, a contractor comes in as needed.

Technical Constraints

The majority of computing tasks for the Historic District are performed in the office at Medalta Museum; however, the buildings are spread across a very wide area. A vehicle is required to reach some locations. There is no private computer network between these buildings. The need for a network may be mitigated somewhat by migration to PastPerfect Online (allowing all machines access to the collections management database), but should this not be sufficient, management might consider the installation of a virtual private network (VPN), for instance, to allow printer and disk sharing across all building locations. For the present, implementation of a network is considered out-of-scope.

The upper limit of what could be digitized is extremely variable. Medalta’s management is adept at acquiring new assets, developing new business models and raising the funds to achieve these goals. There is also a very large existing inventory of artefacts and documents that have not been accessioned, and it is not yet clear how much of this merits digitization. Any digital preservation plan will have to be flexible to this unknown amount of digital content, without costing an undue amount (should the amount of content be minimal to begin with).

Office machines run on MS Windows, and all have high-speed Internet access.

The museum and all buildings associated with it are located in a floodplain.

Description of the Collection & Relevant Characteristics of Digital Objects

There are currently 10 digital asset groups identified for Medalta, some of which are not yet in digital format. As of the most recent revision of this document, all collections management records have been migrated to PastPerfect Online. A large number of operational documents also exist, both from the museum (hard and soft copies) and from the historic manufacturing operations (hard copies). Finally, there are more than 70,000 physical objects (molds, tools and similar artefacts) that may be accessioned and digitized in the near term.

See the completed Digital Preservation Inventory Template for details.

User Requirements

Users require a digital preservation solution that:

  • Addresses the greatest anticipated need: it is anticipated that access to digital assets will most often be through PastPerfect Online. However, other projects (web publishing, 3D rendering, etc.) involve content not available through PastPerfect. Any proposed solution must ensure that all digitized content is readily available to onsite machines.
  • Permits select museum staff to search the digital photos or 3D content using fields common to a collections management record.
  • Provides a short turnaround for queries: no access to preserved content should be hindered by the way in which it is stored. All content should be readily available to onsite staff.
  • Is secure: to ensure integrity of the digital archive, any writeable form (on a hard drive for instance) should have security measures to restrict access to authorized users.
  • Is simple to use: any protocols for digital archive ingest, management and access should not be beyond the capabilities of competent staff who do not have IT or archival backgrounds.
  • Is easily searchable: digital content should be easily found based on likely search criteria (for example, knowing the accession number of a physical object should lead directly to digitized images of that object).
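
The last requirement can be illustrated with a minimal Python sketch. It assumes a hypothetical naming convention in which each file name begins with the accession number, optionally followed by an underscore and a description (for example "1999.12.3_front.jpg"). The convention, the function name and the sample accession number are illustrative assumptions, not part of Medalta's plan.

```python
from pathlib import Path


def files_for_accession(root: Path, accession: str) -> list:
    """Return all files under root whose name is tied to the accession number.

    Assumed (hypothetical) convention: the file name without its extension is
    either the accession number itself or the number followed by "_<text>".
    """
    matches = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        stem = path.stem  # file name without the final extension
        if stem == accession or stem.startswith(accession + "_"):
            matches.append(path)
    return sorted(matches)
```

Requiring an exact match or an underscore after the number keeps a search for 1999.12.3 from also returning files that belong to 1999.12.30.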

Consideration of Possible Action Plans

The following options were considered:

Option 1 – Multiple Backups of Working Files (No Archival Software): Make Preservation Copies Using a Checksum Generator

This option uses external hard drives to make regular backups of digital assets as they exist (backup software is used, but not digital archiving software). The option also uses an MD5 checksum generator to produce annual preservation copies. All content is stored on hard drives, and drives are refreshed (replaced) every five years.

General steps for this option include:

Complete the migration to PastPerfect Online, and ensure that the preservation of this content rests with the host organization. If assurances cannot be obtained to the satisfaction of Medalta, establish a method of regularly receiving downloaded copies of the PastPerfect collections management records (in a tab-delimited or similar format). Intervals for receiving these downloaded copies will depend on the frequency with which records are updated. Store these copies in a secure location; the museum’s office is acceptable.

For all other digital assets:

  • Designate an office machine to host a shareable hard drive, and ensure all digital assets reside on this drive.
  • Clean up the physical media – move content on any existing CDs to the office machine shared drive.
  • Digitize any analog media (cassette, film, reel-to-reel, etc.) as time and budget permit. Keep working copies on the shared drive office machine.
  • For digitized images, audio and film, tie file names to accession numbers.
  • Ensure password protection at the operating system level for all machines.
  • Acquire two additional external (USB) hard drives (5 TB each) for backup and preservation copies.
  • On the first of these two drives, make regular weekly and monthly backups of all relevant directories for all media groups from the shared drive. Also keep annual preservation copies on this drive, using an MD5 checksum generator. Keep this drive connected to the working machine in the main office.
  • On the second of these two drives, maintain copies of the monthly backups, as well as copies of the annual preservation content. Keep this drive offsite, out of the floodplain.
  • All three drives (shared, external 1 and external 2) should never be in the same location at the same time.
  • Refresh media by replacing external hard drives every five years.
  • Update the Digital Asset Inventory annually.
  • Migrate any assets identified in the Digital Asset Inventory to newer file formats, as required.
  • The management of administrative documents is generally not part of a preservation plan for museum holdings, but these can also be copied to external hard drive(s) to ensure backup versions exist.
  • Download a free copy of the MD5summer checksum software, and use this for the creation of any preservation copies (done annually).
  • If available, use password protection on the external hard drives, and provide this password to select staff.
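
For readers unfamiliar with checksum generators, the following Python sketch shows the general idea behind the annual preservation copy: compute an MD5 digest for every file in a preservation folder and record the digests in a manifest stored alongside the files. It is not MD5summer itself. The manifest file name and the "digest *path" line layout are assumptions modelled on the common md5sum style.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(preservation_dir: Path, manifest_name: str = "checksums.md5") -> Path:
    """Write one '<digest> *<relative path>' line per file in preservation_dir."""
    manifest = preservation_dir / manifest_name  # hypothetical manifest name
    lines = []
    for path in sorted(preservation_dir.rglob("*")):
        # Skip directories and the manifest itself.
        if path.is_file() and path != manifest:
            relative = path.relative_to(preservation_dir).as_posix()
            lines.append(f"{md5_of_file(path)} *{relative}")
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest
```

Because the manifest stores paths relative to the preservation folder, it remains valid when the folder is copied to the second drive or to replacement media.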

Option 1 Pros:

  • Addresses immediate need to move digital assets from non-archival CDs.
  • Simple solution – can be implemented without significant training.
  • Affordable – requires only the purchase of two external hard drives.
  • Provides better archival protection than simple backups.
  • Fixity of files can be easily assessed with MD5summer software.
  • Using a checksum is in keeping with archival standards.

Option 1 Cons:

  • Some basic training will be required.
  • Keeping all content on hard drives does not observe the 3-2-1 rule (see footnote 1).
  • No audit trail if preserved copies are changed.
  • Any chosen capacity for the archival hard drive may not be sufficient within its five-year lifespan. It is conceivable that Medalta may digitize more content than originally expected.

Option 2 – Use Disk Pools

For this solution, individual drives described in option 1 are replaced with expandable pools of hard drives across which multiple copies of the content are written, as protection in the event of disk failure. Windows 10 Storage Spaces virtual drives are considered, as these were tested by CHIN to be a robust, easy-to-use system. Moreover, Windows 10 is free to upgrade for existing Windows 7 and 8 users.

The first disks to require expansion are likely to be external drives 1 and 2 (described in option 1), as each of these will require up to six times the storage space of the working (shared) drive. This solution can be applied to the working drive as well (not discussed further), as long as the working drive does not also hold operating system files.

Each external drive described in option 1 should be replaced with a single Storage Space (i.e. two Storage Spaces are required). Each Storage Space consists of a pool of hard drives that are connected to the working computer via a USB 3.0 hub with its own power source. Use one hub for each Storage Space.

Windows Storage Spaces offers various ways of ensuring content is not lost (i.e. “resiliency”) in the event of a drive failure. The most appropriate for archival purposes is using disk parity. In short, this form of resiliency will require longer write time to the drives (backups will be a bit slower than with other options), but it will maximize useable drive space. Three drives are required to start a parity Storage Spaces pool. If a single drive fails, the content remains recoverable from the other two drives, the system will inform users and the drive can easily be replaced. As additional space is required, Windows will inform users of the need to add additional disk drives, and this can be done quickly and easily without compromising existing data.

General steps for option 2 include everything indicated for option 1, as well as the following:

  • Upgrade all computers to Windows 10.
  • For each Storage Spaces pool to be created (likely a pool for each of the two external drives to begin with):
    • Acquire a USB 3.0 hub that has its own power source and a minimum of four USB ports (six or even eight is ideal for future expansion).
    • Acquire three new external hard drives per pool (six in total for the two pools) that are equal in capacity to those used in option 1; together with the two drives from option 1, this makes eight external drives in total.
    • Follow Windows 10 instructions to create a parity Storage Space for three disks only (do not use the drive in option 1 if it currently contains content). State a “Maximum Storage Space Capacity” that is sufficient to hold all digital assets in the foreseeable future (the maximum virtual size is 63 TB, but something less than this number is prudent; 40 TB should handle all digitized content Medalta can generate, at which point both the policy and plan should be revisited). Medalta does not currently need 40 TB worth of drive space to state this capacity. The number simply represents the maximum capacity to which the pool of disks can be expanded in the future.
    • Copy all content from the older drive (described in option 1) to the new pool.
    • Follow Windows instructions to add the older drive (i.e. the drive already being used in option 1) as a fourth drive to the pool. This will reformat the older drive. Then copy existing information from the pool onto it.
  • Repeat the above process for a second pool (and copy the contents of the second external drive described in option 1 to this pool).
  • Pools should then be treated in the same manner as individual drives described in option 1. However, drives that are frequently used do not need to be automatically replaced every five years; instead the pool will inform users as drives need replacing.

Option 2 Pros:

  • All pro considerations for option 1 (although more expensive than option 1).
  • Implemented and maintained without special technical knowledge.
  • Expandable beyond the limits of commercially available disk capacities.
  • More robust than single drives (handles disk degradation better).
  • Longer useable lifespan of disk drives than option 1, as they can safely be used right up to the point that degradation is first detected.

Option 2 Cons:

  • All con considerations for option 1, although there are no limitations to storage capacity.
  • Upfront costs are greater than those of option 1.
  • A Storage Space pool is more complicated to manage than a single disk.
  • Storage Space pools are not backwards compatible with earlier Windows operating systems (an operating system prior to Windows 8 will not be able to read the drives).
  • Risk of the unknown: the option recommends two pools of four drives each (to start with), yet CHIN has not tested pools of more than four drives. External sources show eight drives working in a similar configuration and, in theory, significantly more should be possible.

Option 3 – Same as Option 1 or 2 but use a Cloud Service in Lieu of the Second External Drive

This solution further increases the robustness of the system by ensuring that the second preservation copy is never in the same location as the first and by reducing the likelihood that preservation copies will be destroyed (the cloud service will perform its own backups).

Google Drive and Dropbox are examples of viable cloud service options. Dropbox offers 1 TB of cloud storage for $17/month, which will be sufficient initially. The service promises to provide as much space as required for that fee but has not yet replied to CHIN’s queries for further information.

Bandwidth and data limits with Medalta’s Internet service provider need to be reviewed before this option is selected. A speed test on the Internet connection can be run by performing an Internet search on [Internet speed test], selecting one of the test services and following instructions.

If upload speed is 10 Mbps (typical for basic high speed), then a 100 GB upload (a reasonable estimate of content size after one year of digitizing) will take just over 22 hours, making the process workable in the foreseeable future. Should the amount of data eventually become too cumbersome, either a higher speed Internet connection might be procured or the museum might revert to options 1 or 2 (using a hard drive or a pool of drives for preservation copies) without any penalty.
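
The 22-hour figure can be checked with a short calculation. The Python sketch below assumes decimal units (1 GB = 1,000 MB) and ignores protocol overhead and competing traffic, so real uploads would likely take somewhat longer; the function name is illustrative.

```python
def upload_hours(size_gb: float, speed_mbps: float) -> float:
    """Hours needed to upload size_gb gigabytes at speed_mbps megabits per second."""
    megabits = size_gb * 1000 * 8  # 1 GB = 1000 MB (decimal), 1 byte = 8 bits
    seconds = megabits / speed_mbps
    return seconds / 3600


print(round(upload_hours(100, 10), 1))  # 22.2
```

The same function can be rerun with the measured upload speed and the actual size of the annual preservation copy before committing to a cloud service.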

Option 3 Pros:

  • All pro considerations for options 1 and 2 with the exception of cost (option 1 is most affordable).
  • Increases robustness of system by moving preservation copies offsite and by ensuring a third party is responsible for backups of this copy.
  • Easily implemented.

Option 3 Cons:

  • All con considerations for option 1 except for the limit to storage capacity.
  • More expensive than option 1.
  • Bandwidth and data limits may be an issue with the Internet service provider.
  • Cloud storage located in the United States is subject to the Patriot Act.

Option 4 – OAIS-compliant Model Managed by Medalta

This option focuses on software to provide a formal digital archive that is compliant with Open Archival Information System (OAIS) standards. The simplest form of software necessary to create an OAIS-compliant model is considered. Archivematica was reviewed for this purpose and found to be substantially too onerous (in terms of installation, costs and operational complexity) to be practical for any museum, save possibly the largest. No other OAIS-compliant applications were found to be substantially simpler to operate.

Pros and cons for this option are not considered, as the option is not realistic for a medium-sized museum.

Option 5 – OAIS-compliant Model Managed by an External Service

Canadiana (canadiana.org) has begun offering a digital preservation service using its own trusted digital repository (TDR) and expects to make this commercially available to museums sometime in 2016. The base fee structure for a 1 TB storage space using this service will be $1,000 per annum, with a three-year commitment. While this is more expensive than options 1, 2 and 3, it would preserve the museum’s digital content in the most secure manner possible: a trusted digital repository.

The details of which assets could be preserved in the Canadiana TDR and in what formats are not yet clear. There is little value in submitting PastPerfect records to this service, but many of the other digital asset groups (2 through 10) which are not being accessioned in PastPerfect may be suitable.

The decision to use Canadiana’s TDR will depend on the value of preserving and being able to access information at the object-level (e.g. searching for a record within several archived files) and weighing this against the cost of the service and the labour involved in preparing an object and its metadata.

Selecting an option (1 through 3 in the short term) does not preclude Medalta from revisiting this option at a later date or adopting a hybridized solution (where some assets are preserved using options 1, 2 or 3 and others are preserved on the TDR).

Selected Action Plan

The use of cloud storage (option 3), which is highly flexible to varying storage capacity requirements, was rejected on principle. Namely, all larger cloud storage services use servers located in the US (making them subject to the Patriot Act); Medalta does not wish to expose its data (sensitive or otherwise) to anyone without its express consent.

Option 4 has been deemed untenable for the reasons already mentioned.

Accordingly, the selected solution for Medalta is a hybridization of options 1 and 2, with an acknowledgement that option 5 may become viable as details of Canadiana’s commercial TDR service are made clear. Specifically, start with option 1 by using two external hard drives. Should storage requirements exceed existing drive capacities, convert each drive to a Windows 10 Storage Spaces pool.

Assets that might eventually be stored in Canadiana’s TDR can, in the short-term, be preserved using options 1 or 2 without additional labour or investment. These assets can then be migrated to the TDR if and when Medalta chooses to do so.

Discussion and Details of the Proposed Option

The disk storage capacity in option 1 is sufficient for Medalta’s foreseeable future. In carrying out this option, files in all asset groups will be copied to a 5 TB external hard drive, and copies will be made to a second (redundant) drive.

Preservation copies (i.e. a copy of all asset groups made once each year and accompanied by checksum data) are never to be overwritten.

Shorter-term backups will be made on a weekly and monthly basis. Unfortunately, keeping a straight copy of these shorter-term backups will quickly outstrip drive capacity. Conversely, overwriting them will jeopardize the ability to recover data that was lost earlier in an annual cycle (i.e. data that was created since the last preservation copy was made, but early enough to have since been overwritten).

For these reasons, intelligent backup software is used. Requirements for the backup software are:

  • Perform incremental (delta) copying to reduce time and disk space requirements.
  • Allow deleted files in a source drive to be retained in an archive on the destination drive.
  • Identify when a file has been renamed or moved on the source drive (so that renamed files are not treated as deleted files and then unnecessarily added to the target drive’s archive).
  • Produce identical unencrypted copies of entire files that can be accessed without the use of the backup system or any other specialized software.
  • Run as a batch at predetermined times (e.g. outside office hours).
  • Use file fixity measures to recognize when a file has been changed on either the source or destination drives.
  • Allow entire multiple directories to be specified as sources to be backed up.
  • Allow modified files to be retained in the backup as a version (i.e. revision control).
  • Be easy to use, both in setup and on an ongoing basis.
  • Not slow the system during office hours.
  • Be within the budget of Medalta Museum.

CHIN tested a number of personal backup software packages within the museum’s budget and identified Bvckup 2 as one package that meets the above requirements.

Using this software, a backup job can be scheduled to run on the office machine once a week (e.g. Friday after hours). Files to back up from this machine include all asset groups except collections management records (which are now managed online via PastPerfect). The job performs the following tasks:

Managing new files

Newly created files on the working machine (the “source”) will be copied to a similar directory on the target drive (i.e. External Drive 1).

Managing deleted files

Deleted files are never lost. Any file that is deleted on the working drive is copied to an “Archive” folder on the backup drives.

Managing moves and renames

The system is smart enough to tell the difference between a deletion and a file rename. Renamed files on the working drive will also be renamed in any backups (note that there will also be preservation copies where nothing is changed, renamed or deleted for any reason).

Unwanted propagation of overwrites

Unfortunately, as an unwanted property of the backup software chosen, any file that is overwritten (after the weekly backup has been run) will also be overwritten on the target drive. Version control is discussed further on as a method of addressing this “overwrite risk.”

Unwanted propagation of changes to databases

This is another unwanted property of the chosen backup software: changes to databases are propagated to the backup drives in the same way as a change to any file. However, because the file name of a database does not change, version control is more difficult to manage. The only known database identified in Medalta’s digital inventory has just been migrated to PastPerfect Online, so the issue may be moot. However, in the event that another database is used, a method of addressing unwanted changes to databases is discussed below.

Risks Inherent in the Scheduled Backup Software and the Creation of Preservation Copies

To address the risk of propagating overwrites, version control should be used where possible: when saving a modified file (be it an image file, audio, text etc.), it should always be saved to a new name. If this is not done, the file it overwrote can still be renamed before the weekly backup is run. The target drive will then keep both versions (the overwritten version being automatically moved to an “Archive” folder).

Likewise, to address the risk of unwanted changes to databases, regular backups of these files can be made on a weekly basis (always keeping the last 4 weeks of backups) and on a monthly basis (always keeping the last 12 months of backups). Weekly copies should be overwritten in subsequent months so that no more than 4 weekly copies exist. Similarly, no more than 12 monthly copies should exist.
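
One way to keep the number of rolling copies fixed is to derive the destination folder name from the backup date, so that each new backup lands on (and replaces) the oldest copy in its cycle. The Python sketch below is an illustration only. The folder names follow the directory layout shown later in this plan, while the rule for choosing a weekly slot (a running week count modulo 4) is an assumption.

```python
import calendar
from datetime import date


def weekly_slot(day: date) -> str:
    """Folder for a weekly database backup; cycles through weeks 1 to 4."""
    week_index = day.toordinal() // 7  # running count of 7-day blocks
    return f"Database backup week {week_index % 4 + 1}"


def monthly_slot(day: date) -> str:
    """Folder for a monthly database backup; one folder per calendar month."""
    # calendar.month_name gives English names in Python's default locale.
    return f"Database backups {day.month:02d} {calendar.month_name[day.month]}"
```

With this scheme a backup run on the same weekday each week reuses a weekly folder every fourth week, and each monthly folder is reused one year later, so no more than 4 weekly and 12 monthly copies ever exist.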

There is also the risk that an overwrite or an erroneous change to a database is not noticed in time to correct the mistake before the “rolling” backups described above are overwritten. To address this, permanent preservation copies of all content should be made once a year, starting immediately after the first backup. MD5 checksums should be included with the preservation copies to ensure file fixity. MD5summer software was tested by CHIN for this purpose and should be run on all files in a preservation directory (immediately after the copy is made). The second external hard drive (or pool, if applicable) should be updated once a month to ensure all its content matches that of the first external drive (or pool). Details of these procedures can be found in Annex A.

Managing Preservation Copies

At the end of the fourth year, the directories on both external hard drives should appear as follows:

<Current Backup – Non Database Content>

<Database backup week 1>

<Database backup week 2>

<Database backup week 3>

<Database backup week 4>

<Database backups 01 January>

<Database backups 02 February>

...

<Database backups 12 December>

<Preservation copy backed up 20150430>

<Preservation copy backed up 20160430>

<Preservation copy backed up 20170430>

<Preservation copy backed up 20180430>

Note that contents of the “Current Backup” folder change weekly, and this will need to be overwritten on the second external drive each time the two are synchronized. Conversely, contents of folders entitled “Preservation copy backed up …” do not change, and for that reason, these only need to be copied to the second external hard drive once. Never overwrite a preservation copy.
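
The synchronization rule described above (replace rolling content, never overwrite a preservation copy) can be sketched in Python as follows. This is a simplified illustration rather than the procedure used by the backup software: it assumes preservation folders can be recognized by the prefix "Preservation copy" and treats every other top-level folder as rolling content.

```python
import shutil
from pathlib import Path


def sync_second_drive(first: Path, second: Path) -> list:
    """Copy top-level folders from first to second; return the names copied."""
    copied = []
    for folder in sorted(p for p in first.iterdir() if p.is_dir()):
        target = second / folder.name
        if folder.name.startswith("Preservation copy"):
            if target.exists():
                continue  # write-once: an existing preservation copy is left alone
        elif target.exists():
            shutil.rmtree(target)  # rolling content is replaced wholesale
        shutil.copytree(folder, target)
        copied.append(folder.name)
    return copied
```

Skipping existing preservation folders also means that a file damaged on the first drive is not silently propagated to the second drive's preservation copy.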

The first of the two external hard drives can be stored in the office, and the second should be kept offsite, out of Medalta’s floodplain.

Testing Fixity

Once a year, all checksum information should be tested to ensure files in preservation folders have not changed (i.e. testing for fixity). This can be done with the MD5summer software. The protocol for dealing with a changed file will depend on what is discovered. If, for instance, a number of files on a drive are found to be unreadable, it indicates a problem with the drive itself, and the drive should be replaced. If, however, a single file has a changed checksum and is still readable, it may be nothing more than an inadvertent file write, and the original file might be recoverable from an alternate preservation copy.
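
The annual fixity test amounts to recomputing each file's checksum and comparing it with the value recorded when the preservation copy was made. The Python sketch below illustrates the logic only; it is not MD5summer. It assumes a manifest named "checksums.md5" in the preservation folder with one "digest *relative/path" line per file, both of which are assumptions.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(preservation_dir: Path, manifest_name: str = "checksums.md5") -> dict:
    """Sort every file listed in the manifest into ok, changed or missing."""
    report = {"ok": [], "changed": [], "missing": []}
    manifest = preservation_dir / manifest_name  # hypothetical manifest name
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        recorded, _, relative = line.partition(" *")
        target = preservation_dir / relative
        if not target.is_file():
            report["missing"].append(relative)
        elif md5_of_file(target) == recorded:
            report["ok"].append(relative)
        else:
            report["changed"].append(relative)
    return report
```

Many entries under "changed" or "missing" on one drive would point to a failing drive, whereas a single changed file is more likely an inadvertent write that can be restored from the other preservation copy.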

Refreshing Media

Hard drives should be replaced (i.e. content should be refreshed onto new media) every five years when using option 1 or when disks in option 2 are stored for long periods (this should not happen if following the proposed procedures).

If using option 2 (Storage Spaces), Windows will inform users when drives should be replaced.

When the external hard drives are first purchased, labels should be added to them, indicating what the drive contains (i.e. Preservation Copies 1 Working Machine / Preservation Copies 2 Working Machine). The purchase and replacement dates should also be added to the label (e.g. “Purchased MMM YYYY” and “Replace on MMM YYYY,” where MMM YYYY is the three-letter month followed by the four-digit year, e.g. Mar 2016, and the replacement date is five years from the drive’s date of purchase).
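
As a small worked example of the labelling rule, the following Python sketch builds the label text from a purchase date, assuming the five-year replacement interval stated in this plan. The function name and the three-line layout are illustrative.

```python
from datetime import date

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def drive_label(contents: str, purchased: date, lifespan_years: int = 5) -> str:
    """Return label text with purchase and replacement dates in MMM YYYY form."""
    month = MONTHS[purchased.month - 1]
    return (f"{contents}\n"
            f"Purchased {month} {purchased.year}\n"
            f"Replace on {month} {purchased.year + lifespan_years}")


print(drive_label("Preservation Copies 1 Working Machine", date(2016, 3, 15)))
```

For a drive purchased in March 2016, the label reads "Purchased Mar 2016" and "Replace on Mar 2021".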

The computer’s internal drives should also be replaced, but this is likely to happen with regular renewal of office equipment and is less crucial than refreshing drives holding the preservation copies.

When preservation drives are replaced within an MS Windows storage pool (option 2), nothing special needs to be done (follow the instructions provided by the operating system). When storage pools are not being used, a drive can be replaced by performing a straight copy of everything from the older drive to the newer one (connect both drives to a machine via its USB ports, open Windows Explorer, then drag and drop all folders to the new drive). The MD5summer software should be used prior to refreshing to ensure file fixity (i.e. to ensure that files have not changed in any way).

File Formats and Data Migration

A list of software applications and their associated file formats should be maintained using the Digital Asset Inventory Template. If an existing application or file format is being replaced, a file migration plan will need to be developed at that time. Also, when planning a migration to new file formats or software, the use of formats conducive to long-term preservation should be kept in mind.

Administrative documents that are being retained for no more than seven years are not likely to require a migration plan, as newer office productivity software (word processing, spreadsheets, etc.) is sufficiently backwards-compatible to read and open any document produced within this time period. Administrative documents that are to be kept for longer periods may require a migration plan.

Hardware Diagram of the Proposed Solution

Starting with Option 1

Image description (Diagram 1): This diagram depicts a hardware configuration of an office personal computer (working drive) linked via USB to a 5 TB hard drive, labelled “External HD 1.” A second USB link from the office machine runs to a second external 5 TB hard drive (labelled “External HD 2”), which is generally not left connected to the personal computer but is kept offsite. An Internet connection also connects the PC to the museum’s PastPerfect Online collections management system.

Hardware diagram 1 of the proposed solution

Migrating to Option 2

Image description (Diagram 2): This diagram depicts a hardware configuration which is identical to that shown in diagram 1 (above) with the following changes: the first external hard drive has been replaced with a pool of three 5 TB hard drives. Likewise, the second external hard drive has also been replaced with a similar pool of drives. The resiliency setting used for both these pools is “parity,” which is the recommended setting for archival use. Both pools can be expanded as required.

Hardware diagram 2 of the proposed solution

Detailed procedures of this recommended solution can be found in Annex A of this document.

Annex A – Summary of Procedures for Option 1

Option 1 Procedures

The proposed solution recommends starting with procedures for option 1, then migrating to option 2 if the need for storage space outstrips the size of commercially available external hard drives.

To do immediately:

  • Complete the migration to PastPerfect Online (done).
  • Ensure that accountability to preserve collections management records rests with PastPerfect. If assurances of accountability are not sufficient, establish a protocol to obtain a regular copy of the records (in tab-delimited or a similar format) and store this copy in a secure location, such as the museum’s office.
  • Designate an office machine (to be called the “working machine”) to host a shareable hard drive (“working drive”), and ensure all digital assets reside on this drive.
  • Clean up the physical media – move content on any existing CDs to the working drive.
  • For all digitized images, audio and film: use a file naming system that corresponds to the accession number of any accessioned artefact the content represents. For each artefact that is born-digital (i.e. digital content that is an artefact in its own right and does not represent a physical object being held by the museum), create a new record in PastPerfect.
  • Ensure password protection at the operating system level (user sign-on screen when booting or logging onto a computer) for all machines.
  • Download MD5summer checksum software (free) and Bvckup 2 backup software ($50) and install both on the working office machine.
  • Acquire two additional external (USB) hard drives (5 TB each) for backup and preservation copies. Label the first of these drives as “Preservation Copies 1 Purchased MMM YYYY - Replace MMM YYYY”, where MMM YYYY is the three-letter month followed by the four-digit year (e.g. Mar 2016) and the replace date is five years from the purchase date, and keep it connected to the working machine in the main office. Label the second of these drives as “Preservation Copies 2 Purchased MMM YYYY - Replace MMM YYYY” and keep it offsite.
  • Use Bvckup 2 software to schedule a regular weekly backup of all non-database digital assets to the “Preservation Copies 1” external drive. The backup of all this content should be stored in a directory on the “Preservation Copies 1” drive entitled <Current Backup – Non Database Content>. Important settings for this backup include:
    • Archive any files that were deleted from the working drive.
    • Rename or move any files that were renamed or moved on the working drive.
  • For any active database that resides on the working drive (if for instance asset group 5’s building #13 database is written to), additional monthly and weekly backups should be scheduled. This is necessary because renaming database files (i.e. version control) is often difficult, depending on how the database is used; anytime a database is written to without version control, the previous version is permanently lost.

Four separate weekly backups should be scheduled to write to the following directories on the “Preservation Copies 1” external hard drive:

<Database backup week 1>

<Database backup week 2>

<Database backup week 3>

<Database backup week 4>

Each of these backups should be scheduled to run once every four weeks. This creates a weekly history of the database files that is up to four weeks old at any given time. In addition to running at the scheduled time, the first of these backups should be run immediately.
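
The four-week rotation can be made concrete with a small calculation. The sketch below is illustrative only: it assumes the “week 1” job first ran on a known date (the `first_backup` parameter, a hypothetical reference point) and that each job repeats every 28 days, and it returns the directory that the backup for a given day would write to.

```python
from datetime import date


def weekly_backup_directory(first_backup, today):
    """Return the directory the weekly database backup writes to on `today`.

    `first_backup` is the date the "week 1" job first ran (an assumed
    reference point); each of the four jobs then repeats every 28 days.
    """
    if today < first_backup:
        raise ValueError("today is before the first backup")
    weeks_elapsed = (today - first_backup).days // 7
    slot = weeks_elapsed % 4 + 1  # cycles 1, 2, 3, 4, 1, ...
    return f"Database backup week {slot}"
```

For example, if the first backup ran on 4 March 2016, the backup on 11 March would write to the week 2 directory, and the backup on 1 April would overwrite week 1 again.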

  • Likewise, for any active database files residing on the working drive, twelve monthly backups should be scheduled in Bvckup 2 to the “Preservation Copies 1” hard drive. Each of these backup jobs runs only once per year (at the end of the month corresponding to that backup job). Directories on the “Preservation Copies 1” external hard drive should appear as:

<Database backup 01 January>

<Database backup 02 February>

…

<Database backup 12 December>

In addition to running at the scheduled time, the current month’s backup should be run immediately.
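
As an illustration of the monthly scheme, the following sketch lists the twelve directory names and works out the last day of a given month, which is when that month's job would run. The function names are hypothetical; Bvckup 2 handles the actual scheduling.

```python
import calendar
from datetime import date

MONTH_NAMES = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def monthly_backup_directories():
    """Return the twelve directory names, 01 January to 12 December."""
    return [
        f"Database backup {number:02d} {name}"
        for number, name in enumerate(MONTH_NAMES, start=1)
    ]


def monthly_backup_run_date(year, month):
    """Return the last day of the month, when that month's job runs."""
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, last_day)
```

The two-digit prefix keeps the directories in calendar order when they are sorted by name in Windows Explorer.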

  • Once the first weekly and monthly backups have been run, create an annual preservation copy on the “Preservation Copies 1” drive, as per the “To Be Performed on an Annual Basis” section below.
  • Using Bvckup 2, schedule a job that runs only when manually executed. This job copies the <Current Backup – Non Database Content> folder from the “Preservation Copies 1” drive to the “Preservation Copies 2” drive. Use this job instead of the Windows copy feature, as it will be much quicker in future months.
  • On the “Preservation Copies 2” drive, maintain copies of the monthly backups, as well as copies of the annual preservation content. Keep this drive offsite, out of the floodplain.
  • Ideally, all three drives (working, “Preservation Copies 1” and “Preservation Copies 2”) should not be connected to the same computer at the same time, and copies from “Preservation Copies 1” to “Preservation Copies 2” should be carried out on a separate (virus-free) computer in the office or nearby building. Barring this, care should be taken to minimize the amount of time that all three drives (or sets of drives) are connected to the same machine.
  • Depending on the operating system and brand of external hard drive used, a security measure may be available to password protect the drive. If so, it should be used on both external drives to prevent unauthorized access. Ensure that the password is available to select staff.
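
The drive-labelling convention described in the list above (purchase date, plus a replace date five years later) can be sketched as follows. This is an illustrative helper only; the function names are hypothetical, and the five-year lifetime is the figure stated in this plan.

```python
from datetime import date

MONTH_ABBREVIATIONS = (
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
)


def drive_label(copy_number, purchased, lifetime_years=5):
    """Build the label text for a preservation drive."""
    month = MONTH_ABBREVIATIONS[purchased.month - 1]
    return (
        f"Preservation Copies {copy_number} "
        f"Purchased {month} {purchased.year} - "
        f"Replace {month} {purchased.year + lifetime_years}"
    )


def replacement_due(purchased, today, lifetime_years=5):
    """Return True once the replace month on the label has been reached."""
    due = (purchased.year + lifetime_years, purchased.month)
    return (today.year, today.month) >= due
```

A drive purchased in March 2016 would be labelled “Preservation Copies 1 Purchased Mar 2016 - Replace Mar 2021”, and `replacement_due` would report it as due from March 2021 onward.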

Perform the following as time permits:

  • Digitize any analog media (cassette, film, reel-to-reel, etc.) as time and budget permit. Keep working copies on the working drive in the office.
  • For digitized images, audio and film, create a naming system that ensures each resource has a unique file name. An ideal file name convention is one that contains the record ID for any database with which it is associated (an accession code for instance). This allows digitized content to be quickly located through database searches and database content to be quickly identified for a given file.
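
One way to apply such a naming convention is sketched below. The accession number format (“2016.12.3”), the underscore separator and the three-digit sequence number are all assumptions made for illustration; the plan does not prescribe them. The first function builds a unique file name from a record ID, and the second recovers the record ID from a file name so that the matching database record can be looked up.

```python
import re
from pathlib import Path


def asset_file_name(accession_number, sequence, extension):
    """Build a file name such as '2016.12.3_001.tif' (hypothetical scheme)."""
    # Replace whitespace and characters Windows forbids in file names.
    safe_id = re.sub(r'[<>:"/\\|?*\s]+', "-", accession_number.strip())
    return f"{safe_id}_{sequence:03d}.{extension.lower().lstrip('.')}"


def record_id_from_file_name(file_name):
    """Recover the record ID portion of a name built by asset_file_name."""
    return Path(file_name).stem.rsplit("_", 1)[0]
```

The sequence number allows several files (for instance, multiple views of one object) to share a record ID while each keeping a unique name.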

To be performed routinely

  • Migrate to new file and software formats as required. See the “File Formats and Data Migration” section of this document for details.
  • Any time that a migration to new software takes place, ensure the migrated files with new formats (for the newer software) are in directories on the working drive that will be backed up and preserved.
  • Maintain existing password protection on the working office machine and any machine that accesses it through a network.
  • Keep the first external hard drive (“Preservation Copies 1”) connected to the working machine in the museum office.
  • Keep the second external hard drive (“Preservation Copies 2”) offsite, out of the floodplain.
  • Use version control (i.e. name files according to the sequence in which they were edited either with a version number or a modification date) on all documents that will be edited.
  • Immediately rename any file that mistakenly overwrote another, so that the original file will be archived on the backup drives.
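
The version-control naming described above can be illustrated with a small helper. This is a sketch under assumed conventions (a “v” prefix with two digits, or a YYYYMMDD date, appended after an underscore); staff may prefer a different pattern, provided it is applied consistently.

```python
from datetime import date
from pathlib import Path


def versioned_name(file_name, version=None, modified=None):
    """Append a version number or a modification date to a file name."""
    path = Path(file_name)
    if version is not None:
        tag = f"v{version:02d}"
    elif modified is not None:
        tag = modified.strftime("%Y%m%d")
    else:
        raise ValueError("supply either a version number or a date")
    return f"{path.stem}_{tag}{path.suffix}"
```

Because each saved version has a distinct name, the weekly backup keeps every version rather than overwriting the previous one.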

To be performed on a weekly basis

  • Review and verify that all files in the identified asset groups are within the scope of the weekly backup (i.e. that all the relevant directories are listed in the weekly backup). Add any missing directories to all four weekly backups as well as all twelve monthly backups.
  • If active databases exist, ensure that the most recent weekly backup to “Preservation Copies 1” was properly run (check the Bvckup 2 log, and check the “modified” date of files in the target directories).
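
The “modified” date check can also be scripted. The sketch below is an assumption-laden illustration, not a replacement for reading the Bvckup 2 log: it finds the most recent modification time among the files in a backup directory and reports whether it falls within an expected window. Depending on how the backup tool sets timestamps, a database that has not changed may legitimately show an older date.

```python
from datetime import datetime, timedelta
from pathlib import Path


def newest_modification(directory):
    """Return the latest modified time among files in directory, or None."""
    times = [
        path.stat().st_mtime
        for path in Path(directory).rglob("*")
        if path.is_file()
    ]
    return datetime.fromtimestamp(max(times)) if times else None


def backup_looks_current(directory, max_age_days=7, now=None):
    """Return True if the newest file is no older than max_age_days."""
    now = now or datetime.now()
    newest = newest_modification(directory)
    return newest is not None and now - newest <= timedelta(days=max_age_days)
```

For the monthly check, the same function could be called with a larger `max_age_days` value, such as 31.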

To be performed on a monthly basis

  • If active databases exist, ensure that the most recent monthly backup to “Preservation Copies 1” was properly run (check the Bvckup 2 log, and check the “modified” date of files in the target directories).
  • Connect both external “Preservation Copies” hard drives to a separate, virus-free computer (this is ideal, but if a second computer is not available, using the working machine for the following procedures is acceptable).
    • Use Bvckup 2 to copy the <Current Backup – Non Database Content> folder from the “Preservation Copies 1” drive to the “Preservation Copies 2” drive. This will take a long time the first time it is done, but if the same Bvckup 2 job is used in the future, it will be much quicker.
    • Copy any new annual <Preservation copy backed up YYYYMMDD> directories (as described in the “Managing Preservation Copies” section of this document) from the “Preservation Copies 1” drive to the “Preservation Copies 2” drive (using the copy or drag and drop functions in Windows Explorer).
    • If active databases exist, make a copy of the most recent monthly backup on “Preservation Copies 1” to “Preservation Copies 2” (overwriting any backup for the same month in the previous year).
  • Reconnect the “Preservation Copies 1” drive to the working machine, and return the “Preservation Copies 2” drive to its offsite location.

To be performed on an annual basis

Update the Digital Asset Inventory

Identify any new digital assets, as well as any changes to existing ones. Make particular note of any changes to software used. If there are changes, identify what files (if any) need to be migrated to new formats. Migrations should first be performed on copies found in the working drive. Next, any files found in “Preservation Copies” (and which are no longer on the working drive) can be migrated.

Create the Annual Preservation Copy

  • Starting immediately after the first backup on the working machine, copy (using Windows Explorer) the “Current Backup” folder and all its subfolders located on the external hard drive labelled “Preservation Copies 1” to a new folder on the same hard drive named “Preservation copy backed up YYYYMMDD” (where YYYYMMDD is the date the backup was made).
  • If any active database files exist, starting immediately after the first monthly backup on the working machine, copy (using Windows Explorer) this backup (e.g. <Database backup nn Month>) to the newly created “Preservation copy backed up YYYYMMDD” directory (described above).
  • Run MD5summer on the new “Preservation copy backed up YYYYMMDD” folder to store checksum metadata for it, and store this checksum data in the new directory.
  • The monthly update of the offsite hard drive (“Preservation Copies 2”) should be carried out shortly after the above steps are completed.
  • Test fixity of all preservation folders on both external hard drives, as per the “Testing Fixity” section of this document.
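
The annual steps above can be sketched in a few lines. This is an illustration only, assuming the copy is made with a script rather than Windows Explorer and MD5summer: the manifest file name `checksums.md5` is hypothetical, and its md5sum-style layout (digest, space, asterisk, relative path) is not necessarily identical to the file MD5summer writes.

```python
import hashlib
import shutil
from datetime import date
from pathlib import Path


def preservation_folder_name(backup_date):
    """Return the folder name, e.g. 'Preservation copy backed up 20160301'."""
    return f"Preservation copy backed up {backup_date:%Y%m%d}"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of one file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def create_preservation_copy(current_backup, drive_root, backup_date):
    """Copy the current backup to a dated folder and store its checksums."""
    target = Path(drive_root) / preservation_folder_name(backup_date)
    shutil.copytree(current_backup, target)  # raises if target already exists
    lines = [
        f"{md5_of_file(path)} *{path.relative_to(target).as_posix()}"
        for path in sorted(target.rglob("*"))
        if path.is_file()
    ]
    # The manifest is written last, so it does not list itself.
    manifest = target / "checksums.md5"
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target
```

Because `shutil.copytree` refuses to overwrite an existing folder, an earlier preservation copy with the same date cannot be replaced by accident.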

To be performed once every five years

  • Acquire new external hard drives.
  • Test fixity of files in preservation copies of the current hard drives.
  • Copy all folders to the new hard drives, and dispose of old external hard drives, as per the “Refreshing Media” section of this document.

Annex B – Summary of Procedures for Option 2

This option involves the use of Windows Storage Spaces disk pools in lieu of single drives. Use these procedures only when the required storage space of external hard drives described in option 1 exceeds the capacity of individual hard drives currently available on the market.

Create Disk Pools

  • Upgrade to Windows 10 on all machines that will access the shared drive.
  • For each pool to be created:
    • Acquire a USB 3.0 hub that has its own power source and a minimum of four USB ports (six or even eight ports is ideal to allow for future expansion).
    • Acquire three new external hard drives, each equal in size to those used in option 1 (i.e. six new drives in total across both pools). Label all new drives with the date purchased (e.g. “Preservation Copies Pool 1, disk 1, date purchased YYYY-MMM-DD”, where YYYY-MMM-DD is the four-digit year, three-letter month and two-digit day; e.g. 2016-Mar-01).
    • Follow Windows 10 instructions to create a Storage Space (set the resiliency type to “parity”) for three disks only. Do not use the drive in option 1 if it currently contains content. State a “Maximum Storage Space Capacity” that is sufficient to hold all digital assets in the foreseeable future (e.g. 40 TB). You do not currently need 40 TB worth of drive space to state this capacity; it is simply the maximum capacity to which the drive can be expanded in the future.
    • Copy all content from the older drive (described in option 1) to the new pool.
    • Follow Windows instructions to add the older drive (i.e. the drive already being used in option 1) and a fourth drive to the pool. This will reformat both drives, and copy existing information from the pool onto them.
  • Repeat the above process for a second pool (and copy the contents of the second external drive described in option 1 to this pool).
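
When sizing a pool, it helps to remember that parity resiliency reserves part of the raw disk space. The calculation below is a simplified sketch that assumes a single-parity layout in which one of every three columns holds parity data; the figures Windows actually reports will vary with formatting overhead and the pool's configuration.

```python
def usable_parity_capacity_tb(drive_count, drive_size_tb, columns=3):
    """Estimate usable space when one column per stripe holds parity.

    `columns` is an assumed layout parameter, not a value from this plan.
    """
    if drive_count < columns:
        raise ValueError("a parity space needs at least `columns` drives")
    raw_capacity = drive_count * drive_size_tb
    return raw_capacity * (columns - 1) / columns
```

Under these assumptions, a pool of three 5 TB drives offers roughly 10 TB of usable space rather than 15 TB, which is worth keeping in mind when deciding how soon a pool will need to be expanded.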

Using Disk Pools

Pools should be treated in the same manner as individual drives described in option 1.

Replacing Drives

Drives within a pool that are frequently used (i.e. frequently read and written to) do not need to be automatically replaced every five years; instead the pool will inform users as these physically degrade. If drives within a pool are stored for several months without read or write access (this should not be the case under recommendations for options 1 and 2) then drives should be replaced within five years of the date purchased.

Annex C – Summary of Procedures for Option 5

This section is to be fleshed out if Medalta elects to preserve some or all of its content on a trusted digital repository (TDR) such as Canadiana. The migration to such a TDR can take place at any time and can be done in conjunction with options 1 or 2.
