Chapter 7

Future Directions

This chapter describes improvements that could be made to the current prototypes of the digiment generator and browser, as well as extensions and modifications that could be made to the digiment format.

7.1 Improving the Digiment Generator

The current prototype of the digiment generator has some shortcomings, and could be improved in a number of ways. Implementing some of these improvements and features would make the generator much more practical for other institutions and third parties. Although the generator was designed to work specifically with TRs from the MIT Library 2000 project, it was also built with modularity and expansion in mind.

7.1.1 Reducing dependence on MIT server structure
The biggest obstacle to making this software available is its dependence on the MIT TR structure. The digiment generator relies on knowledge of the local filesystem storage structure and of file naming conventions to determine the types of data objects available for a digiment. The generator also requires PIF files to be available for any image formats. If these dependencies were abstracted out into modules, the software could work with any type of storage structure once an appropriate module had been created. The current system already has some abstractions, but no modules have been created for any environment other than MIT.
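One way to picture such a module is as a small interface that the generator calls instead of touching the filesystem directly. The sketch below is only an illustration: the class names, method names and directory layout are all hypothetical, and none of them come from the current prototype or describe the MIT layout.

```python
from abc import ABC, abstractmethod
from pathlib import PurePosixPath
from typing import Dict, List


class StorageModule(ABC):
    """Hypothetical interface hiding one site's storage layout from the generator."""

    @abstractmethod
    def list_formats(self, doc_id: str) -> List[str]:
        """Return the names of the formats available for a document."""

    @abstractmethod
    def locate(self, doc_id: str, fmt: str) -> str:
        """Return where the data for one format of a document is stored."""


class DirectoryPerFormatStorage(StorageModule):
    """Illustrative layout <root>/<doc_id>/<format>; an assumption, not the MIT layout."""

    def __init__(self, root: str, known: Dict[str, List[str]]):
        self.root = PurePosixPath(root)
        # doc_id -> formats; stands in for scanning a real directory tree.
        self.known = known

    def list_formats(self, doc_id: str) -> List[str]:
        return sorted(self.known.get(doc_id, []))

    def locate(self, doc_id: str, fmt: str) -> str:
        if fmt not in self.known.get(doc_id, []):
            raise KeyError("%s has no %s version" % (doc_id, fmt))
        return str(self.root / doc_id / fmt)


storage = DirectoryPerFormatStorage("/archive", {"TR-1": ["postscript", "gif-100dpi"]})
formats = storage.list_formats("TR-1")
location = storage.locate("TR-1", "postscript")
```

A site with a different layout would supply its own subclass, and the rest of the generator would not need to change.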

7.1.2 Recognizing other body-list types besides Postscript

The current digiment generator only creates body-list parts for Postscript versions of a document. Since the MIT TRs that are on-line only contain Postscript and/or images, this was not a problem for the initial version of the generator. However, once we start doing OCR on the scanned images, we will have text versions of the documents available, and possibly word positioning information as well, depending on the type of OCR used.

Furthermore, other sites may make use of PDF, XDOD (Xerox Document On Demand), or other formats. The generator should be easy to modify so that it recognizes any available format and generates the appropriate body-lists.
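A registry of format detectors is one possible way to make this kind of extension cheap. The following is a minimal sketch under the assumption that a format can be recognized from file names alone; the detector names and file extensions are illustrative and are not taken from the prototype.

```python
from typing import Callable, Dict, List

# Maps a format name to a function that decides whether that format is present.
FORMAT_DETECTORS: Dict[str, Callable[[List[str]], bool]] = {}


def detector(format_name: str):
    """Decorator that registers a detector under the given format name."""
    def register(func):
        FORMAT_DETECTORS[format_name] = func
        return func
    return register


@detector("postscript")
def has_postscript(files: List[str]) -> bool:
    return any(f.lower().endswith(".ps") for f in files)


@detector("pdf")
def has_pdf(files: List[str]) -> bool:
    return any(f.lower().endswith(".pdf") for f in files)


@detector("text")
def has_text(files: List[str]) -> bool:
    return any(f.lower().endswith(".txt") for f in files)


def available_formats(files: List[str]) -> List[str]:
    """Return the names of all registered formats found among the files."""
    return sorted(name for name, test in FORMAT_DETECTORS.items() if test(files))


found = available_formats(["tr-500.ps", "tr-500.txt", "0001.gif"])
```

Supporting a new format then means adding one decorated function, and the generator would build one body-list for each name returned.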

7.1.3 More intelligent merging of page-maps

The current digiment generator only does a very simple merging of page-maps. If two page-maps are identical, it will merge them into a single page-map and update any page-lists that referenced them to point to the new page-map. This method works well for scanned images where all the available formats are derived from the same original images and therefore have the same page-mapping structure. However, this system breaks down when trying to relate a single color image to a list of greyscale images. The page-map for the color image will consist of a single entry with the VSN and page number of the color page. The greyscale page-map will consist of entries for each page in the page-list. Since the two page-maps will not be identical, they will not be merged.

The appropriate action to take would be to intelligently merge the two page-maps. This would involve examining the page number from the color image and finding the entry with the same page number in the greyscale page-map. If the two VSNs are identical, then the greyscale page-map can be used for both versions. Having the extra greyscale entries in the page-map would not cause a problem for the color page-list, since any entry that is not in both the page-list and page-map is ignored.

If the two VSNs are not identical, the generator would have to resequence one or both of the VSN series in order to ensure that the color and greyscale image have the same VSN. As long as the new VSNs are in the same relative order, this does not create any problems.
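The merge described above can be sketched as follows. This is not the generator's code: it assumes a page-map can be modelled as a dictionary from VSN to page number, that page numbers are unique within the larger page-map, and that resequencing means renumbering the smaller series to match the larger one. The function name is hypothetical.

```python
def merge_page_maps(large, small):
    """Try to fold the page-map `small` into the page-map `large`.

    Both arguments map VSN (int) -> page number (str). Returns a pair
    (merged_map, vsn_remap), where vsn_remap maps each VSN used by `small`
    to the VSN it should use in the merged map, or None if no merge is possible.
    """
    page_to_vsn = {page: vsn for vsn, page in large.items()}

    remap = {}
    for vsn, page in small.items():
        if page not in page_to_vsn:
            return None  # page missing from the larger map: cannot merge
        remap[vsn] = page_to_vsn[page]

    # Resequencing is only safe if the new VSNs keep the same relative order.
    new_sequence = [remap[vsn] for vsn in sorted(remap)]
    if any(a >= b for a, b in zip(new_sequence, new_sequence[1:])):
        return None

    # Extra entries in the merged map are harmless: an entry that is not in
    # both the page-list and the page-map is ignored.
    return dict(large), remap


greyscale = {1: "i", 2: "1", 3: "2", 4: "3"}
color = {1: "2"}  # a single color image of page "2"
merged, remap = merge_page_maps(greyscale, color)
```

Here the color page-list would be rewritten to use VSN 3 instead of VSN 1, after which both page-lists can share the greyscale page-map.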

7.1.4 Ability to make page-lists for images without PIFs

Another nice feature would be to simulate a PIF for sequences of images that do not have a PIF associated with them. As long as the images could be ordered in some non-arbitrary way, such as by examining the filename, a pseudo-PIF could be created for them. This PIF would be in the same format as a normal PIF, but would have the value "unknown" for every field other than the image number. This would allow the generator to construct a page-map and page-list for the images, so that they can be viewed, but without any page numbering or cross-referencing of types available.
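A minimal sketch of pseudo-PIF construction is shown below. The real PIF field names are not reproduced here, so the names in PIF_FIELDS are placeholders, and ordering by the digits embedded in the filename is just one example of a non-arbitrary order.

```python
import re

# Placeholder names; the actual fields of a PIF record are not assumed here.
PIF_FIELDS = ("page_number", "page_type")


def natural_key(name):
    """Sort key that orders 'page2' before 'page10' by comparing digit runs as integers."""
    return [int(tok) if tok.isdigit() else tok.lower()
            for tok in re.split(r"(\d+)", name)]


def make_pseudo_pif(filenames):
    """Return a list of (filename, record) pairs, one per image, in filename order.

    Every field other than the image number is set to "unknown".
    """
    result = []
    for number, name in enumerate(sorted(filenames, key=natural_key), start=1):
        record = {"image_number": number}
        record.update({field: "unknown" for field in PIF_FIELDS})
        result.append((name, record))
    return result


pseudo_pif = make_pseudo_pif(["page10.tif", "page2.tif", "page1.tif"])
```

The generator could then build a page-map and page-list from these records exactly as it does for a real PIF, with every page number reported as unknown.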

7.2 Improving the Digiment Browser

The current prototype of the digiment browser does not take full advantage of the digiment format. In addition, using the WWW as a front end has several inherent limitations, most notably the lack of control over the interface. By improving the browser and integrating it with its own interface system, it could be a useful tool for viewing digital documents.

The single most useful enhancement would be to convert the browser from a WWW based system to a stand alone system. This would permit the browser to have complete control over the interface system, instead of relying on HTML and a WWW browser. Most notably, the digiment browser could incorporate code to allow it to handle multiple image types as well as other types of data internally. This would allow the browser to display these pages simultaneously with the rest of the interface, instead of the current system that requires the user to click on an additional button to display the data for a specific page.

In addition, a stand alone browser could support multiple windows that it could update at will. Besides the window with the actual document data in it, other possible interface windows include a window with a list of available pages, a window with bibliographic information for the current document, a window with a list of all available data parts, a window with a list of alternative formats for the current part, a list of recently viewed digiments, or simply multiple data windows at the same time. Since the browser controls the windows directly, it could update the information in them when the user changed pages, data types or digiments.

Trying to fit all of these possible interface options into the single window provided by a WWW based browser would result in a very cluttered interface that would be confusing to the user.

A stand alone browser would also have the feature that it could be used as a WWW helper application, allowing it to fit seamlessly into the existing WWW structure. When a user who is using a WWW browser clicks on a link that points to a digiment, the WWW browser can recognize that the returned object is a digiment and automatically launch the digiment browser. Thus, a user could search for a particular article on-line, click on one of the results, and have the digiment pop up in a new window on his workstation.

7.3 New Extensions and Enhancements to the Application/digiment Type

The digiment specification outlined in this document provides for the basic set of digiment-types that are necessary to display most forms of data, as well as carry bibliographic information and relationships between parts. However, this is not an exhaustive set of types and new digiment-types are expected and welcomed. Some examples of possible new digiment-types are listed below.

7.3.1 A pricing digiment-type

A pricing digiment-type could contain retrieval pricing information for each data object contained within a digiment. This would allow a digiment server to charge for the information it serves and to provide real time pricing information to clients. A possible format for this type would be to simply list each data object by content-ID and VSN, along with the cost for retrieval. Users could then choose which version of a digiment to view depending on how much they are willing to pay. For example, a plain text version and a Postscript version could be charged at different rates.

This would also allow users to request the same digiment from multiple servers to compare pricing structures versus the formats offered and the speed of delivery.
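To make the idea concrete, the sketch below parses a pricing part and totals the cost of each version. The line format (content-ID, VSN and cost separated by whitespace), the content-IDs and the prices are all invented for illustration; no such format has been specified.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical pricing part: one data object per line as "content-ID VSN cost".
SAMPLE_PRICING = """\
text-part 1 0.01
text-part 2 0.01
ps-part 1 0.05
ps-part 2 0.05
"""


def parse_pricing(text):
    """Return a dict mapping (content_id, vsn) -> Decimal cost."""
    prices = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        content_id, vsn, cost = line.split()
        prices[(content_id, int(vsn))] = Decimal(cost)
    return prices


def cost_per_part(prices):
    """Total retrieval cost of every data object, grouped by content-ID."""
    totals = defaultdict(Decimal)  # Decimal() is zero
    for (content_id, _vsn), cost in prices.items():
        totals[content_id] += cost
    return dict(totals)


totals = cost_per_part(parse_pricing(SAMPLE_PRICING))
cheapest = min(totals, key=totals.get)
```

A client could run the same comparison against the pricing parts returned by several servers before deciding where to retrieve the document.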

7.3.2 A distribution-rights digiment-type

A similar digiment-type could be created that specified distribution rights or copyrights for each data type. Two different versions of the same information may not have the same copyright or distribution rights. For example, a text version of a document may have unlimited distribution rights while the Postscript formatted version may be restricted. Or a PDF version of the document which is view-only may have different distribution rights than a fully editable version. In addition, this type could be combined with the pricing type to allow companies to charge different amounts for copies of the same document with different distribution rights.

7.3.3 A digiment-structure digiment-type

This data type could contain information specifying the structural layout of a digiment. It could list the structural parts of a document, such as chapters, appendixes and multimedia annotations, and associate each with the correct items from the part-lists and body-lists. A part like this would allow a browser to construct a Table of Contents for a digiment, which the user could use to navigate the digiment.
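As a rough illustration, a browser could turn such a part into a Table of Contents along the following lines. The StructureEntry fields are hypothetical, and the sketch assumes each structural part records the VSN of its first page so that the page-map can supply the printed page number.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class StructureEntry:
    kind: str       # e.g. "chapter" or "appendix"
    title: str
    first_vsn: int  # VSN of the first page of this part (an assumption)


def table_of_contents(entries: List[StructureEntry], page_map: Dict[int, str]) -> List[str]:
    """Return one Table of Contents line per structural part, in document order."""
    lines = []
    for entry in sorted(entries, key=lambda e: e.first_vsn):
        page = page_map.get(entry.first_vsn, "?")
        lines.append("%s: %s ... %s" % (entry.kind.title(), entry.title, page))
    return lines


toc = table_of_contents(
    [StructureEntry("appendix", "Glossary", 40),
     StructureEntry("chapter", "Introduction", 3)],
    page_map={3: "1", 40: "38"},
)
```

Each line could be made clickable in a browser, taking the user to the page-list entry with the corresponding VSN.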

7.3.4 Enhancing the preferred-order type

Currently, the only description of what a preferred-order type contains is in a human-readable string. It may be useful to add another, structured, label to the preferred-order type, which can be parsed by a program. This would allow a program to automatically scan through the available preferred-order types and pick the most suitable one to use. However, this would require enumerating all possible ways in which preferred-order types could differ, so that a program can determine which one is best suited to its needs. This is not an easy task.

7.4 An Alternative Form for the Page-list Digiment-type

One of the problems with the current form of the page-list is that it does not actually contain the data; it simply references it. While this is desirable for many applications, such as a user who simply wants to browse a document that is stored centrally, it is not desirable for electronic document delivery systems, which want to deliver a completely self-contained document in a single transaction. In addition, although the MIME specification allows the embedding of either a data object or a reference to a data object stored elsewhere, the current page-list definition uses its own method of referencing external objects, so MIME-compliant agents cannot automatically extract the pages. This is because I determined the MIME version of referencing to be too inefficient and too complex for this application. However, it is possible to build page-lists that completely conform to the MIME specification and allow the option of either referencing or embedding the data object. A program that could build page-lists of this form was created along with the standard page-list generator described earlier, but is not being used.

This can be done by storing a page-list as a multipart MIME object, with each page in the list stored as a separate object. Each object has a standard MIME Content-Type and Content-ID. The value of the Content-ID header is used in place of a VSN. If the object is embedded into the digiment, the Content-Type header contains the MIME type for the data object. If the object is stored externally, a special MIME type of "message/external-body" is used, which is defined by the MIME standard to reference externally stored data objects. Any client that is MIME-compliant can resolve this reference and obtain the data object itself.
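The following sketch shows what such a page-list might look like when built with Python's standard email package. It is not the program mentioned above. The content-IDs, host name, path and the choice of the anon-ftp access type are placeholders, and placing the Content-ID in the encapsulated header of the external reference follows my reading of the MIME standard rather than anything stated in this document.

```python
from email import encoders, message_from_string
from email.message import Message
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart


def embedded_page(content_id, data, maintype, subtype):
    """A page whose bytes travel inside the digiment."""
    part = MIMEBase(maintype, subtype)
    part.set_payload(data)
    encoders.encode_base64(part)  # also sets Content-Transfer-Encoding: base64
    part["Content-ID"] = "<%s>" % content_id
    return part


def external_page(content_id, maintype, subtype, site, name):
    """A page stored elsewhere, referenced through message/external-body."""
    part = Message()
    # add_header turns the keyword access_type into the parameter access-type.
    part.add_header("Content-Type", "message/external-body",
                    access_type="anon-ftp", site=site, name=name)
    # The encapsulated header describes the data that the reference points to.
    described = Message()
    described["Content-Type"] = "%s/%s" % (maintype, subtype)
    described["Content-ID"] = "<%s>" % content_id
    part.attach(described)
    return part


def build_page_list(pages):
    page_list = MIMEMultipart("mixed")
    for page in pages:
        page_list.attach(page)
    return page_list


page_list = build_page_list([
    embedded_page("page-1@example.org", b"GIF89a-placeholder", "image", "gif"),
    external_page("page-2@example.org", "image", "gif",
                  site="ftp.example.org", name="/tr/page-2.gif"),
])
wire_text = page_list.as_string()

# Any MIME-aware agent can parse the result without digiment-specific code.
parsed = message_from_string(wire_text)
```

The same page-list mixes one embedded page and one referenced page, which shows that the choice can be made per page rather than per digiment.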

Using a page-list of this type, a digiment could either reference the data objects that are stored on the server, or embed the data objects within the digiment in order to transfer the entire document in a single transaction. However, a simple digiment containing 50 images in only two different formats would take approximately 30 kilobytes, as opposed to only 5 kilobytes with the current page-list.

This system could also be modified to work with body-lists as well as page-lists.

7.5 The Digiment as Interchange Format

One of the most useful properties of a digiment is the ability to represent an entire document, including all of its representations and variations. The digiment representation is independent of the actual formats used to store the document on a particular server. By promoting this format as an interchange standard, it would be possible to use the digiment format to transfer entire documents from one machine to another. This transfer may be accomplished in a number of ways. One way would be to transfer first the digiment object, then each of the files which it references, by traditional protocols, such as FTP, HTTP or even email.

Another way would be to create a new page-list type, such as the one described in section 7.4, which embeds the actual data in the digiment. The digiment could then be transferred by traditional protocols. The advantage of this scheme is that the entire document could be transferred at one time, rather than transferring each of the data files separately.

However, the digiment format could also be built into existing protocols. For example, FTP has an ASCII mode, which reads text in from the host system, converts it into netascii format, and then converts it back into the proper ASCII format on the client system. This conversion to an intermediate format and back is necessary to properly deal with the way different systems signify end of lines. A digiment mode could be added to FTP, which would automatically convert a document into digiment format, transfer it to a client system, and convert it into a format suitable for storage on the client system.

A digiment transfer mechanism should also let the client specify a subset of a digiment. For example, a client should be able to request a digiment, but only the 100 DPI, 5-bit images. The mechanism should then only include the data in the page-list which corresponds to the requested type. In addition, a client should be able to request a digiment in a specific preferred-order. In this case, the digiment would only include those subsets of the page-lists and body-lists which are actually referenced by the preferred-order.
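Both kinds of request amount to filtering the page-lists before transfer, which the sketch below illustrates. The PageList fields and the representation of a preferred-order as a list of (content-ID, VSN) pairs are assumptions made for the example, not part of the digiment specification.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class PageList:
    content_id: str
    dpi: int
    bits: int
    vsns: Tuple[int, ...]


def subset_by_type(page_lists: List[PageList], dpi: int, bits: int) -> List[PageList]:
    """Keep only the page-lists of the requested resolution and bit depth."""
    return [p for p in page_lists if p.dpi == dpi and p.bits == bits]


def subset_by_preferred_order(page_lists, preferred_order):
    """Keep only the pages that the preferred-order actually references."""
    wanted = {}
    for content_id, vsn in preferred_order:
        wanted.setdefault(content_id, set()).add(vsn)

    result = []
    for p in page_lists:
        if p.content_id in wanted:
            kept = tuple(v for v in p.vsns if v in wanted[p.content_id])
            result.append(replace(p, vsns=kept))
    return result


page_lists = [
    PageList("gif-100-5", dpi=100, bits=5, vsns=(1, 2, 3)),
    PageList("gif-300-8", dpi=300, bits=8, vsns=(1, 2, 3)),
]
low_res = subset_by_type(page_lists, dpi=100, bits=5)
ordered = subset_by_preferred_order(page_lists, [("gif-300-8", 1), ("gif-300-8", 3)])
```

Body-lists could be filtered in the same way, so the server only has to send the data objects that survive the filter.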

A digiment can be made even more portable by using MIME encoding methods. These methods are defined in the MIME standard to permit the transfer of arbitrary binary data between different systems and gateways. By encoding and decoding each of the data parts appropriately, a program can transfer a digiment from one system to another without having to worry about the specific storage formats on either system, or what gateways the data had to pass through.

By either writing programs that use traditional protocols, creating new protocols, or adding new types to existing protocols, it would be possible to transfer any document to another system in the digiment format. Such a document interchange format would be invaluable for transferring documents between repositories or for transferring a new document from a publisher into a repository.