Digitization Systems

Overview and Caution on the Imaging of Public Records

Increasingly, UW System institutions are using document imaging technologies to digitize paper records, both to gain efficiency through quick, simultaneous access and to reduce the physical storage space that paper records require.  Despite these benefits, imaging should not be considered a cost-saving initiative in most cases.  Offices must understand that committing to a document imaging process brings added requirements, responsibilities, and costs that are not always encountered in the management of paper records.

Specifically, offices will encounter two new costs unknown in the realm of paper records.  The first is the production of the digital files themselves.  This involves preparing the records for imaging (removing paper clips, staples, etc.), the scanning itself, and the creation of the information about the files needed for storage and retrieval (metadata).  Some offices might also wish to make their digital records keyword-searchable, which takes added time and software.  While these may be one-time costs for each set of records, they are not inconsiderable.

The second set of costs deals with the long-term preservation of these records over the retention period.  Because technology changes, records digitized now may not be accessible in 15, 10, or even six years.  The hardware and software required to read the files may become unavailable or unsupported during the retention period of the records.  This is known as obsolescence (think 3.5-inch floppy disks or 8-track tapes).  To overcome it, records may need to be “migrated”, that is, reformatted and copied to newer, supported technologies.  Similarly, even supported optical or magnetic media used to store the records can become corrupted and fail, preventing access to a record.  To prevent this, digital records must be backed up, periodically checked for errors, and, if needed, “refreshed” onto new media.  Migration, continual error checking, and refreshment are fundamental parts of an electronic records preservation plan and must be budgeted into every imaging project, particularly those involving collections with long retention periods (more than seven years).  Offices unable to provide access to older records in their possession could be found in violation of Wisconsin Administrative Rule 12, which requires that electronic records be accessible, accurate, authentic, reliable, legible, and readable throughout the record life cycle.  This can expose an office, and even its employees, to potentially costly litigation and audits.

Because digital records depend on computerized technologies that are constantly evolving, they are some of the most fragile records ever created.  No document imaging program should therefore be initiated without careful planning AND consultation with IT staff, the campus records officer and, at times, the Office of General Counsel for UW System.  These individuals will work with the office to develop a complete imaging process and ensure that the newly created digital records satisfy legal requirements.

Planning

Offices, with the help of campus IT, records management staff, and General Counsel, will undertake the following steps in the planning and development of an imaging process:

  1. Workflow Analysis: The first step in planning for a document imaging process is a thorough records and workflow analysis to determine and document existing and planned agency information needs.  This includes a cost-benefit analysis to determine whether the costs of these activities are justified by the benefits their implementation brings to the agency.

  2. Records Scheduling: If it is determined that imaging is a cost-effective way to meet the current and future needs of the office, the records series to be digitized must be scheduled if not already covered under an existing Records Disposition Authorization (RDA) or general records schedule (GRS).  The campus records officer will assist with this.  NO public record should be imaged without a current records schedule in place.

  3. Develop imaging process: With proper records schedules in place, the office can then proceed to create the digitization process itself.  This process should be documented for future reference. Offices are encouraged to undertake document imaging activities ONLY through an enterprise-wide document management system. Such systems should be purchased only if they comply with all relevant legal requirements of the Board of Regents Records Management Policy, Wis. Stats. §§ 16.61 & 16.611, and Wisconsin Administrative Code, Chapter Adm 12.  Bidding, reviewing, and purchasing such a system should not take place without consultation with the campus records officer and the Office of General Counsel for UW System.

    Whether part of an enterprise-wide document management system or not, the following components must be considered and satisfied in the creation of an imaging process.

    Document arrangement: Prior to scanning, it must be determined how the imaged records will be organized, specifically, the unit of organization for the digital copies. Document imaging is not always a one (source)-to-one (digital copy) process.

    Document preparation: Offices must prepare documents for efficient scanning (remove staples, unfold paper, remove extraneous documents, etc.).

    Identification/Metadata: Offices must capture metadata (information about the documents) that will allow the records to be identified, organized, searched and preserved. This metadata may be recorded in something as simple as a file-naming convention or as complex as an indexing system.  Prior to scanning, offices must commit to a metadata scheme that employs consistent data entry practices (names and date formats, etc.). Controlled vocabularies are also recommended.  

    Technical considerations: Offices must decide on the desirable file formats and other technical requirements for scanning, storage, and retrieval. See below for technical recommendations.

    Conversion Process: Offices must limit the individuals authorized to do the actual scanning to help ensure the records are accessible, accurate, authentic, reliable, legible, and readable throughout the record life cycle. These individuals must be well trained on equipment and software. The conversion process should result in SOURCE FILES that are to be used as the official/preservation copy only.  Subsequent CONVENIENCE COPIES can be created from the source file for use in the office.

    Quality control: Digital images must be inspected by the office to ensure that they are of sufficient quality to help ensure the record’s accessibility, accuracy, authenticity, reliability, legibility, and readability throughout the record life cycle. In some cases, sampling may be utilized to conduct quality control.

    Storage: Records passing quality control should then be stored in accordance with the storage recommendations below. These requirements are necessary to help fulfill the provisions of Administrative Rule 12.

    Disposition of source documents:  In many cases, the original paper documents can be discarded once accessible, accurate, authentic, reliable, legible, and readable digital records are created from them.  This activity needs to be represented in the RDA and, if the records are considered confidential, carried out in the manner prescribed by law.

    Preservation: A basic preservation plan must be put into place before imaging can begin.  The plan will identify the retention period for the records (taken from the RDA or GRS), how and when the digital files will be tested for errors, and when migration will be considered.
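The error testing called for in the preservation plan is commonly done by recording a checksum for each file and recomputing it later (often called fixity checking).  The following is a minimal sketch of that idea, not a prescribed UW System procedure.  The choice of SHA-256, the in-memory manifest format, and the function names `sha256_of`, `build_manifest`, and `find_damaged` are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Record a checksum for every file under root (hypothetical manifest format)."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_damaged(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the files that are missing or whose checksum no longer matches."""
    damaged = []
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file() or sha256_of(target) != expected:
            damaged.append(name)
    return damaged
```

An office might build the manifest once after quality control, keep it apart from the images, and rerun `find_damaged` on the schedule set in the preservation plan.  Any file it reports would then be restored from a backup copy.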

Technical Recommendations

Digitization technologies allow offices to control the resolution, size, color, bit depth, and other qualities that affect how the image appears on a computer screen or is output to a printer.  Furthermore, once captured, a digital image can be saved in numerous file formats that may or may not include compression technologies that reduce file size. The choices offices make in these areas need to be cost-effective while still producing a record that is accessible, accurate, authentic, reliable, legible, and readable throughout its life cycle.  It is in UW System’s interest to limit these options in the majority of cases to ensure consistency of work across campuses. The following recommendations for the creation of SOURCE FILES should satisfy MOST offices’ needs (from the National Archives and Records Administration’s Technical Guidelines, pp. 32-36).

| Image Type | Bit Depth | Color Mode | Resolution (ppi/spi) | Scale | File Format | Compression |
|---|---|---|---|---|---|---|
| B&W Text | 1-bit | B&W (bitonal) | 600 ppi/spi | 100% | TIFF | CCITT Group 4 |
| B&W Text with Illustrations (charts, artwork, graphs, photos) | 8-bit | Grayscale | 400 ppi/spi | | TIFF | None |
| Color Illustrations with Text | 24-bit | RGB | 400 ppi/spi | | TIFF | None |
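To budget storage for the settings above, the uncompressed size of one page can be estimated as width in pixels times height in pixels times bit depth, divided by 8 to give bytes.  The sketch below applies that arithmetic to a US Letter page (8.5 x 11 inches) scanned at 100% scale.  The page size and the function name `uncompressed_bytes` are assumptions, and the estimate ignores file headers and row padding.  The 1-bit figure is the size before CCITT Group 4 compression, so the stored file will be smaller.

```python
def uncompressed_bytes(width_in: float, height_in: float, ppi: int, bit_depth: int) -> int:
    """Estimate the uncompressed image size in bytes for one scanned page."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    total_bits = width_px * height_px * bit_depth
    return (total_bits + 7) // 8  # round up to whole bytes


# The three recommended image types, on an assumed 8.5 x 11 inch page.
SETTINGS = {
    "B&W text (1-bit, 600 ppi)": (600, 1),
    "Grayscale (8-bit, 400 ppi)": (400, 8),
    "Color (24-bit, 400 ppi)": (400, 24),
}

for label, (ppi, bit_depth) in SETTINGS.items():
    size = uncompressed_bytes(8.5, 11, ppi, bit_depth)
    print(f"{label}: {size / 1_000_000:.1f} MB per page before compression")
```

Under these assumptions the estimates come to about 4.2 MB, 15.0 MB, and 44.9 MB per page.  A 10,000-page color series would therefore need roughly 449 GB for the SOURCE FILES alone, before any backup copies.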

CONVENIENCE COPIES, those used in the office rather than for preservation, may be of more diverse formats and resolutions in order to best fit the needs of the office.  The office may, for example, wish to create JPG or PDF files from the SOURCE FILES that are of lower resolution and are compressed for day-to-day use.

Storage Recommendations

The storage of converted records is not unlike that of other electronic public records and, as such, decisions must be made to ensure these digital records are accessible, accurate, authentic, reliable, legible, and readable throughout their life cycle, all requirements of Wisconsin Administrative Rule 12. As mentioned above, these decisions should be documented in a basic preservation plan made in consultation with IT staff, the campus records officer and, at times, the Office of General Counsel for UW System.  This is especially important for record series with extended retention periods (more than seven years). A major component of the preservation plan will be how and where the records are stored.

It is recommended that SOURCE FILES of digitized records be stored on a network server or as part of an enterprise-wide document management system and NOT on removable media.  If no document management system is available, offices should construct a file system on the server that allows for efficient record retrieval and management.  Appropriate campus IT staff must be notified if the official copy of any public record is to be stored on campus servers.  If, for whatever reason, it is not possible to preserve SOURCE FILES on a server, it is recommended that these records be stored only on removable WORM (write once, read many) discs such as CD-R or DVD-R, and NOT on rewritable media such as CD-RW, DVD-RW, Zip disks, or USB drives. At least two copies of each disc should be made and kept in separate, secure locations.
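A server file system that supports retrieval usually pairs a predictable folder layout with a consistent file-naming convention.  The sketch below shows one hypothetical convention, not a UW System standard.  The series identifier `RDA-0123`, the series/year folder layout, and the function name `source_file_path` are all assumptions made for illustration.  Dates are written in ISO 8601 form (YYYY-MM-DD) so that file names sort chronologically.

```python
import re
from datetime import date
from pathlib import PurePosixPath


def source_file_path(series_id: str, record_date: date, item_no: int, page_no: int) -> PurePosixPath:
    """Build a relative path such as RDA-0123/2014/RDA-0123_2014-03-07_0042_p001.tif."""
    # Restrict the series id to characters that are safe in file names.
    if not re.fullmatch(r"[A-Za-z0-9-]+", series_id):
        raise ValueError(f"series id has unsupported characters: {series_id!r}")
    if item_no < 1 or page_no < 1:
        raise ValueError("item and page numbers start at 1")
    # Zero-padded numbers keep items and pages in order when sorted by name.
    name = f"{series_id}_{record_date.isoformat()}_{item_no:04d}_p{page_no:03d}.tif"
    return PurePosixPath(series_id) / str(record_date.year) / name


print(source_file_path("RDA-0123", date(2014, 3, 7), 42, 1))
# prints RDA-0123/2014/RDA-0123_2014-03-07_0042_p001.tif
```

Whatever convention an office adopts, generating names from validated fields rather than typing them by hand helps keep data entry consistent across staff.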

For both SOURCE FILES and CONVENIENCE COPIES, additional efforts must be made to protect the data from tampering and/or theft. Campus IT, risk management staff, and the Office of General Counsel for UW System must be consulted for advice on securing records.

This document is borrowed, in part, from: