Scanned Document Record Revision History

Revision History of the Library 2000 Scanned Document Record Specification

Changes to produce Version CSTR 1.4 from CSTR 1.3, February 2, 1996:

1. Added optional "Touched up" field to record fact of touchup and any surrounding circumstances.

2. Extended convention on image number to include trailing letter, which may be used to indicate touched-up versions.

3. Explicitly stated that images are intended to be viewed in order by image number.

4. Removed some of the open questions and revised wording of others.

5. Specified that a zero checksum is to be ignored, in accordance with current practice of placing checksums in a separate file.
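Items 2, 3, and 5 above can be illustrated with a minimal sketch. It is not taken from the specification: the function names are hypothetical, the image number is assumed to be decimal digits followed by at most one lowercase letter, and placing a lettered version directly after its unlettered counterpart is an assumption. The checksum algorithm is not described in this history, so the computed value is simply passed in.

```python
import re

# Assumed shape of an image number: digits, then an optional single letter
# (the trailing letter may mark a touched-up version).
_IMAGE_NUMBER = re.compile(r"(\d+)([a-z]?)")


def parse_image_number(text):
    """Split an image number such as '0012a' into (12, 'a')."""
    match = _IMAGE_NUMBER.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not an image number: {text!r}")
    return int(match.group(1)), match.group(2)


def viewing_order(image_numbers):
    """Return image numbers in viewing order: numeric part first, then letter."""
    return sorted(image_numbers, key=parse_image_number)


def checksum_matches(recorded, computed):
    """A recorded checksum of zero is ignored; otherwise the values must agree."""
    if recorded == 0:
        return True  # checksum is kept elsewhere, so nothing to compare here
    return recorded == computed
```

Sorting on the parsed pair rather than on the raw string keeps "10" after "2" regardless of whether leading zeros are present.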

Intermediate changes to Version CSTR 1.3, January 26, 1995 (practice unchanged, so version number unchanged):

1. Changed content specifier "control" to "doccontrol" to agree with example.

2. Removed an unneeded requirement that the map entries be in order by image number.

3. Added a note about when version numbers should be changed.

4. Changed the documentation of version field label to match the example.

5. Clarified description of when a page is considered numbered, rather than unnumbered.

6. Updated list of open questions.

7. Better described the order of the map fields in the case of double-sided input.

Changes to produce Version CSTR 1.3 from CSTR 1.2, January 11, 1995:

Note: CSTR 1.2 was incompletely implemented and was used on only a handful of scanned TRs. The checksum and filelength fields were always zero, the format documentation map entry identified a non-existent file, and the version field label overflowed the column width. The version number was incremented to 1.3 to coincide with correct implementation.

1. Changed the field identifier "Scanned Document Record Version" to the shorter "Scanning Record Version" to avoid a field overflow problem in Excel.

Changes to produce Version CSTR 1.2 from CSTR 1.1, November 25, 1994:

1. Added the integer checksum of the image file to the map field.

2. Added the file length to the map field.

3. Added a format documentation file to the example map.

Changes to produce the final Version CSTR 1.1, November 25, 1994: (copy unlocated)

1. Added a "comment" field, and used it to provide column headings for the map in the example.

2. Added a new section on file naming conventions between the old sections 4 and 5.

3. Modified the file names in the example map to match the naming conventions. (Removed string "image", added suffix ".tif", inserted leading zeros in image numbers, added string "LCS".)

4. Removed "scanned document record file name" field, and replaced it with a map entry for the scan record.

6. Added "scanrecord" and "format" to the list of content identifiers.

8. Added to the introduction a specification that image files be in the TIFF format.

9. Added a specification that browsers should ignore unrecognized fields.
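The rule in item 9, that browsers should ignore unrecognized fields, might look like the sketch below in a reader. The on-disk layout of a scan record is not described in this history, so the sketch assumes one field per line with a tab between label and value. The set of recognized labels is only a sample of names mentioned in this history, and the function name is hypothetical.

```python
# Sample of field labels mentioned in this history, compared case-insensitively.
# Assumption: a real reader would list every label defined by the specification.
RECOGNIZED = {
    "scanning record version",
    "source",
    "text quality",
    "touched up",
    "comment",
    "map",
}


def read_record(lines):
    """Collect recognized fields from 'label<TAB>value' lines (assumed layout).

    Unrecognized labels are skipped silently instead of raising an error,
    so a record written under a later version can still be read.
    """
    fields = {}
    for line in lines:
        label, separator, value = line.rstrip("\n").partition("\t")
        if not separator:
            continue  # no tab: not a field line under this sketch's assumption
        key = label.strip().lower()
        if key not in RECOGNIZED:
            continue  # ignore fields this reader does not know about
        # A label may repeat (one Map field per image), so keep a list.
        fields.setdefault(key, []).append(value.strip())
    return fields
```

Skipping unknown labels rather than failing is what lets new fields be added to the record without breaking older readers.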

Changes to produce the third draft of Version CSTR 1.1, November 18, 1994:

1. The "file name" field is gone.

2. All file names (in the scanning record field and map fields) are now fully spelled out.

4. A new field, named "source", is added to the record. It can take on the values "First-generation original", "Later-generation copy", or "PostScript".

Changes to produce the second draft of Version CSTR 1.1, November 9, 1994:

1. The page count field is gone, on the basis that the concept is ill-defined. The only count field is now the image count, which is a precise concept.

2. The names of the fields "Original form" and "Original size" are now "Input form" and "Input size" to reflect the observation that these are the form and size of the pages actually scanned. The first-generation originals may have had a different form or size.

3. The names of the fields "Intended print form" and "Intended print size" are now "Suggested print form" and "Suggested print size" to reflect the possibility that the original intent may not be known; this is the best guess of the current publisher.

4. There is a new field "Text quality" with the allowed values "light", "dark" and "normal", to capture observations by the scanning operator. This field is a first cut at communicating this class of information from the scanning operator to the display process and it is likely to evolve.

5. The field "file name generator" is now simply "File name". In addition to using this string as the prefix for each image file name, it will also be used conventionally as the name of the folder or directory that contains all the image and related files for a single report. Also, the conventional value of the field no longer ends with a hyphen; the hyphen is understood to be inserted whenever the name is used as a prefix.

6. The value of the scanning record file name field no longer includes the file name prefix, but the file name itself does carry the prefix.

7. The names of files in Map fields no longer include the filename prefix, but the file names themselves do carry the prefix.

8. All file names in Map fields now begin with the string "image-". The original thought of distinguishing image- from other- is better handled by the content descriptor associated with each image.

9. Image numbers in image filenames no longer have leading zeros.

10. The "..." convention is gone; it is replaced with a plan that there will be one Map field for every image of the file. (This change is coupled with a plan to use Excel to create the scanned image record. Excel, given two consecutive examples of image-to-page number correspondences, can easily fill in any number of similar correspondences.)

11. Problems with particular pages, such as the original being skewed or wrinkled or containing a photo, will be recorded as a comment on the Map entry for that page.

12. The "page" content identifier is renamed "numbered" and is defined as containing the publisher's intended page number (assuming the intent can be determined) whether or not the page number actually appears on the page.
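As a rough illustration of how items 5 and 7 through 9 of this second draft fit together, the sketch below builds the name recorded in a Map field and the name of the file itself. The function names and the sample prefix are hypothetical, and no file suffix is shown because this draft does not mention one. These conventions were revised again in the final Version CSTR 1.1.

```python
def map_entry_name(image_number):
    """Name as recorded in a Map field: 'image-' plus the number, no leading zeros."""
    return f"image-{int(image_number)}"


def image_file_name(file_name_field, image_number):
    """Name of the file itself: the 'File name' value used as a prefix.

    The field value does not end with a hyphen; the hyphen is inserted
    whenever the value is used as a prefix.
    """
    return f"{file_name_field}-{map_entry_name(image_number)}"


# Hypothetical 'File name' value, used here only for illustration.
prefix = "EXAMPLE-TR-1"
print(map_entry_name(7))           # name recorded in the Map field
print(image_file_name(prefix, 7))  # name of the file in the report's folder
```

Under item 5, the same "File name" value would also conventionally name the folder or directory holding all files for the report.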

Scanned document record Version CSTR 1.1, First draft, October 30, 1994:

A major rewrite of the experimental scan record definition proposed in May, 1994.

Scanned document record Version CSTR 1.0, First draft, March 23, 1994, and Second draft, May 27, 1994:

The first draft did not carry a version number, but is hereby designated Version CSTR 1.0.

For more information contact Jerry Saltzer <Saltzer@mit.edu>