Issue No.05 - September/October (2003 vol.15)
pp: 1277-1294
Kyong-Ho Lee , IEEE Computer Society
Yoon-Chul Choy , IEEE Computer Society
ABSTRACT
This paper presents a syntactic method for logical structure analysis that transforms multipage, hierarchically structured document images into electronic documents based on SGML/XML. To produce a logical structure more accurately and quickly than previous approaches, whose basic units are text lines, the proposed parsing method takes hierarchically structured text regions as input. Furthermore, we define a document model that efficiently describes the geometric characteristics and logical structure information of documents, and we present a method for creating it automatically. Experimental results with 372 images scanned from the IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) show that the method performed logical structure analysis successfully and generated a document model automatically. In particular, because the method generates SGML/XML documents as the result of structural analysis, it enhances the reusability and platform independence of documents.
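As a rough illustration of the final step the abstract describes, emitting XML from hierarchically structured text regions, the following minimal Python sketch serializes a labeled region tree. It is not the authors' method: the syntactic parsing that assigns logical labels is assumed to have already happened, and the `Region` class, the `to_xml` function, and the element names (`article`, `title`, `section`, `heading`, `paragraph`) are hypothetical names chosen for this example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class Region:
    """Hypothetical text region that already carries a logical label."""
    label: str                     # assumed logical label, e.g. "title"
    text: str = ""                 # recognized text, empty for containers
    children: List["Region"] = field(default_factory=list)


def to_xml(region: Region) -> ET.Element:
    """Recursively convert a labeled region tree into an XML element tree."""
    elem = ET.Element(region.label)
    if region.text:
        elem.text = region.text
    for child in region.children:
        elem.append(to_xml(child))
    return elem


# A made-up region hierarchy standing in for one analyzed page.
page = Region("article", children=[
    Region("title", "Logical Structure Analysis"),
    Region("section", children=[
        Region("heading", "1 Introduction"),
        Region("paragraph", "Body text."),
    ]),
])

xml_string = ET.tostring(to_xml(page), encoding="unicode")
print(xml_string)
```

Because the nesting of regions maps directly onto the nesting of elements, the output is independent of the page geometry, which is one way to read the abstract's claim about reusability and platform independence.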
INDEX TERMS
Logical structure analysis, document image understanding, structured documents, SGML, XML, a syntactic method.
CITATION
Kyong-Ho Lee, Yoon-Chul Choy, Sung-Bae Cho, "Logical Structure Analysis and Generation for Structured Documents: A Syntactic Approach", IEEE Transactions on Knowledge & Data Engineering, vol.15, no. 5, pp. 1277-1294, September/October 2003, doi:10.1109/TKDE.2003.1232278