Seventh International Conference on Document Analysis and Recognition (ICDAR'03) - Volume 2
Graph Grammar Based Analysis System of Complex Table Form Document
Edinburgh, Scotland
August 03-August 06
ISBN: 0-7695-1960-1
Structure analysis of table form documents is important because printed documents, and also electronic documents, explicitly provide only geometrical layout and lexical information. To handle these documents automatically, logical structure information is necessary. In this paper, we first propose a general XML-based representation of table form documents that contains both structure and layout information. Next, we present a structure analysis system based on a graph grammar that represents document structure knowledge. Because the relations between adjacent fields in table form documents are two-dimensional, a two-dimensional notation is necessary to denote structural knowledge; we therefore adopt a two-dimensional graph grammar. Because the rules are relatively simple, the grammar notation makes this knowledge easy to modify and to keep consistent. Another advantage of the grammar notation is that it can be used to generate documents from the logical structure alone. Experimental results show that the system successfully analyzed several kinds of table forms.
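As a rough illustration of the idea, and not the authors' grammar or XML schema, the sketch below applies a single two-dimensional adjacency rule (a label field with a value field directly to its right or directly below it forms an item) and emits XML that carries both the logical grouping and the layout coordinates. Every name here (`Field`, `right_of`, `below`, `analyze`, and the `form`, `item`, `label`, `value` elements) is a hypothetical choice made for this example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """A rectangular field on the form (hypothetical model, not from the paper)."""
    kind: str  # "label" or "value"
    text: str
    x: int
    y: int
    w: int
    h: int


def right_of(a: Field, b: Field) -> bool:
    """True if b touches a's right edge and covers the same vertical extent."""
    return b.x == a.x + a.w and b.y == a.y and b.h == a.h


def below(a: Field, b: Field) -> bool:
    """True if b touches a's bottom edge and covers the same horizontal extent."""
    return b.y == a.y + a.h and b.x == a.x and b.w == a.w


def field_to_xml(parent: ET.Element, f: Field) -> ET.Element:
    """Append a field element whose attributes keep the layout information."""
    el = ET.SubElement(parent, f.kind, x=str(f.x), y=str(f.y), w=str(f.w), h=str(f.h))
    el.text = f.text
    return el


def analyze(fields: list[Field]) -> ET.Element:
    """Apply one illustrative rule: item -> label followed by an adjacent value."""
    root = ET.Element("form")
    used: set[Field] = set()
    for a in fields:
        if a.kind != "label":
            continue
        for b in fields:
            if b.kind != "value" or b in used:
                continue
            if right_of(a, b) or below(a, b):
                relation = "right" if right_of(a, b) else "below"
                item = ET.SubElement(root, "item", relation=relation)
                field_to_xml(item, a)
                field_to_xml(item, b)
                used.update({a, b})
                break
    return root


# Made-up sample layout: one horizontal pair and one vertical pair.
sample = [
    Field("label", "Name", 0, 0, 40, 10),
    Field("value", "Amano", 40, 0, 60, 10),
    Field("label", "Date", 0, 10, 100, 10),
    Field("value", "2003", 0, 20, 100, 10),
]
print(ET.tostring(analyze(sample), encoding="unicode"))
```

A real system of the kind described would need many such rules and tolerance for imprecise geometry; exact edge matching is used here only to keep the sketch short.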
Citation:
Akira Amano, Naoki Asada, "Graph Grammar Based Analysis System of Complex Table Form Document," icdar, vol. 2, pp. 916, Seventh International Conference on Document Analysis and Recognition (ICDAR'03) - Volume 2, 2003