From Paper to Office Document Standard Representation
July 1992 (vol. 25 no. 7)
pp. 63-67

This article presents the principles of Pi ODA (paper interface to office document architecture), a model-based document analysis system developed as a prototype for analyzing single-sided business letters in German. Pi ODA first extracts a part-of hierarchy of nested layout objects, such as text blocks, lines, and words, based on their presentation on the page. In a subsequent step called logical labeling, the layout objects and their compositions are analyzed geometrically to identify the corresponding logical objects, those that carry a human-perceptible meaning, such as the sender, recipient, and date of a letter. Context-sensitive text recognition is then applied to the logical objects, using logical vocabularies and syntactic knowledge. The result is a document representation that conforms to the ODA international standard.
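
To make the logical labeling step more concrete, the following is a minimal sketch of how a part-of hierarchy of layout objects might be labeled from geometry alone. It is not Pi ODA's implementation: the `LayoutObject` class, the `RULES` table, the region boundaries, and the `label_blocks` function are all hypothetical names and values chosen for illustration, and the sample page is invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LayoutObject:
    """A node in a part-of hierarchy (page > block > line > word)."""
    kind: str                      # "page", "block", "line", or "word"
    x: float                       # left edge, as a fraction of page width
    y: float                       # top edge, as a fraction of page height
    w: float                       # width, as a fraction of page width
    h: float                       # height, as a fraction of page height
    children: List["LayoutObject"] = field(default_factory=list)
    label: Optional[str] = None    # logical label, filled in by labeling


# Hypothetical geometric rules: (label, x_min, x_max, y_min, y_max), tested
# against a block's top-left corner. The regions are assumptions made for
# this sketch and are not taken from the article.
RULES = [
    ("sender",    0.0, 0.5, 0.00, 0.15),
    ("recipient", 0.0, 0.5, 0.15, 0.35),
    ("date",      0.5, 1.0, 0.15, 0.40),
]


def label_blocks(page: LayoutObject) -> LayoutObject:
    """Give each block the label of the first rule region containing its corner."""
    for block in page.children:
        for name, x0, x1, y0, y1 in RULES:
            if x0 <= block.x < x1 and y0 <= block.y < y1:
                block.label = name
                break
    return page


# Invented sample page with four text blocks.
page = LayoutObject("page", 0.0, 0.0, 1.0, 1.0, children=[
    LayoutObject("block", 0.08, 0.05, 0.35, 0.06),
    LayoutObject("block", 0.08, 0.20, 0.35, 0.10),
    LayoutObject("block", 0.70, 0.30, 0.20, 0.02),
    LayoutObject("block", 0.08, 0.45, 0.84, 0.40),
])
label_blocks(page)
labels = [block.label for block in page.children]
```

In this sketch the fourth block matches no region and stays unlabeled. A fuller system would presumably also use the composition of lines and words inside each block, and would pass each labeled block to a recognizer restricted to that label's vocabulary, as the abstract describes.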

Citation:
Andreas Dengel, Rainer Bleisinger, Rainer Hoch, Frank Fein, Frank Hönes, "From Paper to Office Document Standard Representation," Computer, vol. 25, no. 7, pp. 63-67, July 1992, doi:10.1109/2.144442