First International Workshop on Document Image Analysis for Libraries (DIAL'04)
Creating Digital Libraries: Content Generation and Re-Mastering
Palo Alto, California
January 23-24, 2004
ISBN: 0-7695-2088-X
Steven Simske, Hewlett-Packard Laboratories
Xiaofan Lin, Hewlett-Packard Laboratories
This paper has two main goals: to describe the automatic creation of a digital library and to provide an overview of the meta-algorithmic patterns that can be applied to increase the accuracy of its creation. Automating the creation of useful digital libraries, that is, digital libraries affording searchable text and reusable ("re-purposable") output, is a complicated process, whether the original library is paper-based or already available in electronic form. In this paper, we outline the steps involved in the creation of a deployable digital library (<1.2 × 10^6 pages) for MIT Press, as well as its implications for other aspects of digital library creation, management, use and repurposing. Input, transformation, information extraction, and output processes are considered in light of their utility in creating layers of content. Interestingly, in some respects, scanning directly from paper offers extra opportunities for error-checking through feedback-feedforward combination. Strategies for quality assurance (QA) at the document, chapter and book level are also discussed. We emphasize the use of meta-algorithmic design patterns to improve content generation, extraction and re-mastering. This approach also makes it easier to add modules and algorithms to the system and to deprecate existing ones.
Citation:
Steven Simske, Xiaofan Lin, "Creating Digital Libraries: Content Generation and Re-Mastering," First International Workshop on Document Image Analysis for Libraries (DIAL'04), p. 33, 2004