Scanned document record
(First Draft)

March 23, 1994
By Jerry Saltzer

Expanded from ideas suggested at a Library 2000 group meeting on March 17, 1994. Discussants: Jack Eisan, Mitchell Charity, Ali Alavi, Sally Richter, Mary Anne Ladd, Jeremy Hylton, Geoff Seyon, Eytan Adar, Greg Anderson, Jerry Saltzer. (I didn't take attendance, so this is from memory. Whom did I miss?)

The objective of this document is to define the information that should be captured when a document is scanned, as an on-line record that is effectively part of the scanned image form. The format of the information is not defined here, only its content; even that is defined only by example.

The information to be captured falls into three general categories: scanning conditions, document source information, and an array of image map entries.

I. Scanning conditions:

Name of scanning condition set: standard
   Scanner: Fujitsu 9076E with 8191 document feeder
   Software: Optix version 3.1
   Resolution: 400 dpi, 8-bit grey
   Settings: automatic brightness/contrast
   Organization: M.I.T. Document Services
   Operator: Jack Eisan
   Date: March 17, 1994
   Notes:
End of set: standard

Name of scanning condition set: color
   Scanner: HP IIcx with 1801 document feeder
   Software: Photoshop 2.5 LE and DeskScan version 2.0
   Resolution: 400 dpi, 24-bit color
   Settings: automatic brightness/contrast/color-balance/gamma
   Organization: M.I.T. Document Services
   Operator: Jack Eisan
   Date: March 17, 1994
   Notes: Scanner replaced by factory March 15, 1994.
End of set: color

II. Document source information

Document Label: MIT LCS TR-87

Name of source: Source1
   Description: typed sheets on thesis bond, intended for duplex reproduction.
   Size: 8.5 by 11 inches
   Count: 121 sheets
   Duplex: no
   From: LCS publications office.
   Date: August, 1964
   Notes: Page three includes a tipped-in color photograph.
End of source: Source1

Name of source: Source2
   Description: offset reproduction
   Size: 8.5 by 11 inches
   Count: 61 sheets
   Duplex: yes
   From: MIT Library system, Archive copy.
   Date: circa 1980
End of source: Source2

Name of source: greytest
   Description: IEEE standard GS-1994.2 grey-scale target
   Size: 8.5 by 11 inches
End of source: greytest

Name of source: colortest
   Description: Kodak sQ-13 color separation card
   Size: 3 by 8 inches
End of source: colortest

Name of source: blank
   Description: Document Services standard blank page replacement
End of source: blank

III. Scanned image map:

image name          source      sheet/side  original    condition set  notes
                                            pagination
MIT LCS TR-87 1     greytest                --          standard
MIT LCS TR-87 2     Source1     1/1         1           standard
MIT LCS TR-87 3     Source1     2/1         2           standard
MIT LCS TR-87 4     Source1     3/1         3           standard
MIT LCS TR-87 5     colortest               --          color
MIT LCS TR-87 6     Source1     3/1         3           color          1
MIT LCS TR-87 7     blank                   (4)         standard
MIT LCS TR-87 8     Source2     3/1         5           standard
...
MIT LCS TR-87 249   Source1     121/2       241         standard

Notes:
1. Image outside the outline of the color photograph was digitally masked out.

Comments:

Sheet 3 of Source1 (the one with the tipped-in color photo) was scanned twice, once with the grey-scale scanner and once with the color scanner; both images are included in the scanned version of the document. (Question: in the grey-scale image, should the picture be replaced with a note saying that a color image is available? Ideally, we should have enough information here that a clever browser can put the page back together again on the screen. Suggestions are in order!)

The back side of sheet 2 of Source2 was blank, and was therefore replaced with a blank target. Since the next sheet carried the page number 5, page number 4 is implied, which is indicated by placing it in parentheses. If the next sheet had carried the page number 4, the original pagination would instead have been shown as "--".
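The pagination convention just described can be sketched in a few lines of Python. This is only an illustration: the function name `pagination_entry` and its parameters are hypothetical, and the rule "a page number is implied when exactly one number is skipped between the neighboring pages" is an assumption generalized from the single example above.

```python
from typing import Optional


def pagination_entry(printed: Optional[int],
                     previous: Optional[int],
                     following: Optional[int]) -> str:
    """Return the 'original pagination' text for one image map entry.

    printed   -- page number printed on this side, or None if there is none
    previous  -- page number on the preceding page, if known
    following -- page number on the next page, if known
    """
    if printed is not None:
        # The page carries its own number: record it as-is.
        return str(printed)
    if (previous is not None and following is not None
            and following - previous == 2):
        # Exactly one number is skipped, so it is implied for this page;
        # parentheses mark it as implied rather than printed.
        return f"({previous + 1})"
    # No number is printed and none is implied.
    return "--"


# Blank side between page 3 and page 5: page 4 is implied.
print(pagination_entry(None, 3, 5))
# Blank side between page 3 and page 4: nothing is implied.
print(pagination_entry(None, 3, 4))
```

Whether an implied number should ever be inferred across more than one blank side is left open here; the sketch falls back to "--" in that case.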
It is not clear whether a blank target would be appropriate in a different case: the original typed form was intended for single-sided reproduction, with all pages numbered consecutively, but the reproduction was done on two sides, with blank pages inserted as needed so that chapters start on right-hand pages.
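Since this document defines content rather than format, the following Python sketch shows just one possible in-memory shape for the three categories of information. All class, field, and method names are assumptions made for illustration, as is the consistency check, which verifies that every image map entry names a source and a condition set that the record actually defines.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ConditionSet:
    """One named set of scanning conditions (section I)."""
    name: str
    scanner: str
    software: str
    resolution: str
    settings: str
    organization: str
    operator: str
    date: str
    notes: str = ""


@dataclass
class Source:
    """One named document source (section II); most fields are optional."""
    name: str
    description: str
    size: Optional[str] = None
    count: Optional[str] = None
    duplex: Optional[bool] = None
    origin: Optional[str] = None  # the "From:" field
    date: Optional[str] = None
    notes: str = ""


@dataclass
class ImageMapEntry:
    """One row of the scanned image map (section III)."""
    image_name: str
    source: str
    sheet_side: Optional[str]
    original_pagination: Optional[str]
    condition_set: str
    notes: List[int] = field(default_factory=list)


@dataclass
class ScannedDocumentRecord:
    label: str
    condition_sets: Dict[str, ConditionSet] = field(default_factory=dict)
    sources: Dict[str, Source] = field(default_factory=dict)
    image_map: List[ImageMapEntry] = field(default_factory=list)

    def undefined_references(self) -> List[str]:
        """List image map entries that name an undefined source or set."""
        problems = []
        for entry in self.image_map:
            if entry.source not in self.sources:
                problems.append(
                    f"{entry.image_name}: unknown source {entry.source!r}")
            if entry.condition_set not in self.condition_sets:
                problems.append(
                    f"{entry.image_name}: unknown condition set "
                    f"{entry.condition_set!r}")
        return problems


# A small record with one deliberately misspelled source name.
record = ScannedDocumentRecord(label="MIT LCS TR-87")
record.condition_sets["standard"] = ConditionSet(
    name="standard",
    scanner="Fujitsu 9076E with 8191 document feeder",
    software="Optix version 3.1",
    resolution="400 dpi, 8-bit grey",
    settings="automatic brightness/contrast",
    organization="M.I.T. Document Services",
    operator="Jack Eisan",
    date="March 17, 1994",
)
record.sources["Source1"] = Source(
    name="Source1",
    description="typed sheets on thesis bond",
)
record.image_map.append(
    ImageMapEntry("MIT LCS TR-87 2", "Source1", "1/1", "1", "standard"))
record.image_map.append(
    ImageMapEntry("MIT LCS TR-87 4", "Source 1", "3/1", "3", "standard"))

for problem in record.undefined_references():
    print(problem)
```

Keying sources and condition sets by name mirrors the way the image map refers to them, so a check like this can catch a mistyped name before the record is stored with the images.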