Scanned document record (First Draft)
March 23, 1994
By Jerry Saltzer

Expanded from ideas suggested at a Library 2000 group meeting on March 17,
1994.  Discussants:  Jack Eisan, Mitchell Charity, Ali Alavi, Sally
Richter, Mary Anne Ladd, Jeremy Hylton, Geoff Seyon, Eytan Adar, Greg
Anderson, Jerry Saltzer.  (I didn't take attendance, so this is from
memory.  Whom did I miss?)

The objective of this document is to define the information that should be
captured when a document is scanned, as an on-line record that is
effectively part of the scanned image form.  The format of the information
is not defined here, only its content; even that is defined only by
example.

The information to be captured falls into three general categories:
scanning conditions, document source information, and an array of image map
entries.


I.  Scanning conditions:

Name of scanning condition set:  standard
Scanner:  Fujitsu 9076E with 8191 document feeder
Software:  Optix version 3.1
Resolution:  400 dpi, 8-bit grey
Settings:  automatic brightness/contrast
Organization:  M.I.T. Document Services
Operator:  Jack Eisan
Date:  March 17, 1994
Notes:
End of set:  standard

Name of scanning condition set:  color
Scanner:  HP IIcx with 1801 document feeder
Software:  Photoshop 2.5 LE and DeskScan version 2.0
Resolution:  400 dpi, 24-bit color
Settings:  automatic brightness/contrast/color-balance/gamma
Organization:  M.I.T. Document Services
Operator:  Jack Eisan
Date:  March 17, 1994
Notes:  Scanner replaced by factory March 15, 1994.
End of set:  color
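
Although this memo deliberately leaves the format undefined, the labeled
blocks above are regular enough to be read mechanically.  The following is
only a sketch under that assumption: the function name parse_blocks and the
rule of splitting each line at its first colon are illustrative choices, not
part of any agreed format.

```python
def parse_blocks(text, start_label, end_label):
    """Collect 'Field: value' lines into one dict per named block.

    Assumes (hypothetically) that every line of interest has the form
    'Field: value' and that blocks are bracketed by start/end labels.
    """
    blocks = {}
    current = None
    for line in text.splitlines():
        if ":" not in line:
            continue  # blank lines and free text are ignored
        field, value = (part.strip() for part in line.split(":", 1))
        if field.lower() == start_label.lower():
            current = {}
            blocks[value] = current
        elif field.lower() == end_label.lower():
            current = None
        elif current is not None:
            current[field] = value
    return blocks


SAMPLE = """\
Name of scanning condition set:  standard
Scanner:  Fujitsu 9076E with 8191 document feeder
Resolution:  400 dpi, 8-bit grey
Notes:
End of set:  standard
"""

sets = parse_blocks(SAMPLE, "Name of scanning condition set", "End of set")
print(sets["standard"]["Resolution"])
```

The same function would read the document source blocks of the next section
if called with the labels "Name of source" and "end of source".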


II.  Document source information

Document Label:  MIT LCS TR-87

Name of source:  Source1
Description:   typed sheets on thesis bond, intended for duplex reproduction.
Size:  8.5 by 11 inches
Count:  121 sheets
Duplex:  no
From:  LCS publications office.
Date:  August, 1964
Notes:  Page three includes a tipped-in color photograph.
end of source:  Source1

Name of source:  Source2
Description:  offset reproduction
Size:  8.5 by 11 inches
Count:  61 sheets
Duplex:  yes
From:  MIT Library system, Archive copy.
Date:  circa 1980
end of source:  Source2

Name of source:  greytest
Description:  IEEE standard GS-1994.2 grey-scale target
Size:  8.5 by 11 inches
end of source:  greytest

Name of source:  colortest
Description:  Kodak Q-13 color separation card
Size:  3 by 8 inches
end of source:  colortest

Name of source:  blank
Description:  Document Services standard blank page replacement
end of source:  blank


III.  Scanned image map:

image name           source    sheet/side   original    condition set  notes
                                           pagination

MIT LCS TR-87 1      greytest                  --         standard
MIT LCS TR-87 2      source1      1/1           1         standard
MIT LCS TR-87 3      source1      2/1           2         standard
MIT LCS TR-87 4      source1      3/1           3         standard
MIT LCS TR-87 5      colortest                 --         color
MIT LCS TR-87 6      source1      3/1           3         color           1
MIT LCS TR-87 7      blank                     (4)        standard
MIT LCS TR-87 8      source2      3/1           5         standard
   ...
MIT LCS TR-87 249    source1    121/2         241         standard

Notes:
1.  Image outside the outline of the color photograph was digitally masked out.
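
Each row of the image map refers by name to a source and to a scanning
condition set, so one simple consistency check is that every name used in
the map is actually defined.  The sketch below is one possible way to hold a
row and make that check.  The class ImageEntry, its field names, and the
choice to compare names without regard to case are all assumptions made for
illustration, and the last sample row is a made-up bad row included only to
show a failure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImageEntry:
    image_name: str
    source: str
    sheet_side: Optional[str]   # e.g. "3/1"; None for targets and blanks
    pagination: str             # "1", "(4)", or "--"
    condition_set: str
    note: Optional[int] = None  # index into the Notes list, if any


def undefined_references(entries, sources, condition_sets):
    """Return (image name, kind, name) for every name not defined."""
    known_sources = {s.lower() for s in sources}
    known_sets = {c.lower() for c in condition_sets}
    problems = []
    for e in entries:
        if e.source.lower() not in known_sources:
            problems.append((e.image_name, "source", e.source))
        if e.condition_set.lower() not in known_sets:
            problems.append((e.image_name, "condition set", e.condition_set))
    return problems


entries = [
    ImageEntry("MIT LCS TR-87 1", "greytest", None, "--", "standard"),
    ImageEntry("MIT LCS TR-87 2", "source1", "1/1", "1", "standard"),
    ImageEntry("MIT LCS TR-87 6", "source1", "3/1", "3", "color", note=1),
    ImageEntry("MIT LCS TR-87 7", "blank", None, "(4)", "standard"),
    # Hypothetical bad row, not from the example above:
    ImageEntry("EXAMPLE BAD ROW", "source3", "1/1", "7", "standard"),
]

problems = undefined_references(
    entries,
    sources=["Source1", "Source2", "greytest", "colortest", "blank"],
    condition_sets=["standard", "color"],
)
print(problems)
```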


Comments:

Sheet 3 of source1 (the one with the tipped-in color photo) was scanned
twice, once with the grey-scale scanner and once with the color scanner;
both images are included in the scanned version of the document.  (Question:
in the grey-scale image, should the picture be replaced with a note saying
that there is a color image available?  Ideally, we should have enough
information here that a clever browser can put the page back together again
on the screen.  Suggestions are in order!)
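
As a first step toward that kind of browser, the image map already says
which images come from the same side of the same sheet.  The sketch below
groups map rows on that basis.  It is only an illustration: the tuple layout
and the function name alternate_scans are assumptions, and it does not
address where on the page the color image belongs.

```python
from collections import defaultdict

# (image name, source, sheet/side, condition set), taken from the map above
rows = [
    ("MIT LCS TR-87 3", "source1", "2/1", "standard"),
    ("MIT LCS TR-87 4", "source1", "3/1", "standard"),
    ("MIT LCS TR-87 6", "source1", "3/1", "color"),
    ("MIT LCS TR-87 8", "source2", "3/1", "standard"),
]


def alternate_scans(rows):
    """Return the sheet sides that were scanned more than once."""
    by_side = defaultdict(list)
    for image_name, source, sheet_side, condition_set in rows:
        by_side[(source, sheet_side)].append((image_name, condition_set))
    return {key: scans for key, scans in by_side.items() if len(scans) > 1}


dupes = alternate_scans(rows)
print(dupes)
```

Here only side 3/1 of source1 is reported, with its grey-scale and color
images listed together.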

The back side of sheet 2 of source2 was blank, and therefore replaced with
a blank target.  Since the next sheet carried the page number 5, page
number 4 is implied, which is indicated by placing it in parentheses.

If the next sheet had carried the page number 4, the original pagination
would have instead been shown as "--".
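
The pagination convention can be stated as a small rule.  The sketch below
is one reading of it, not a settled definition: it assumes the printed
numbers on the neighboring numbered pages are known, and it treats a number
as implied only when exactly one number is skipped between them.  The
function and parameter names are hypothetical.

```python
def original_pagination(printed, previous=None, following=None):
    """Return the 'original pagination' column entry for one image.

    printed   -- page number printed on this side, or None if it has none
    previous  -- number printed on the preceding numbered page, if known
    following -- number printed on the next numbered page, if known
    """
    if printed is not None:
        return str(printed)
    if previous is not None and following is not None:
        if following - previous == 2:
            # Exactly one number was skipped, so it is implied.
            return f"({previous + 1})"
    return "--"


print(original_pagination(3))            # an ordinary numbered page
print(original_pagination(None, 3, 5))   # blank side between pages 3 and 5
print(original_pagination(None, 3, 4))   # blank side, no number skipped
print(original_pagination(None))         # a test target
```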

It is not clear whether a blank target would have been appropriate if the
original typed form had been intended for single-sided reproduction, with
all pages consecutively numbered, but the reproduction was actually done on
two sides, with blank pages introduced as necessary to make chapters start
on right-hand pages.