Geometric Structure Analysis of Document Images: A Knowledge-Based Approach
November 2000 (vol. 22 no. 11)
pp. 1224-1240

Abstract: Geometric structure analysis is a prerequisite for creating electronic documents from the logical components extracted from document images. This paper presents a knowledge-based method for detailed geometric structure analysis of technical journal pages. The proposed knowledge base encodes, in the form of rules, geometric characteristics that are common to technical journals as well as characteristics that are specific to a publication. The method combines top-down and bottom-up techniques and consists of two phases: region segmentation and region identification. In general, the regions produced by segmentation do not match composite layout components one-to-one. The proposed method therefore identifies nontext objects, such as images, drawings, and tables, as well as text objects, such as text lines and equations, by splitting or grouping segmented regions into composite layout components. In experiments on 372 images scanned from the IEEE Transactions on Pattern Analysis and Machine Intelligence, the proposed method performed geometric structure analysis successfully on more than 99 percent of the test images, which compares favorably with previous work.
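
The two-phase idea described in the abstract (group segmented regions, then identify them with rules) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Region` type, the greedy grouping strategy, the `max_gap` parameter, and the single height-based rule are all hypothetical choices made only for illustration, and the paper's actual knowledge base is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Axis-aligned bounding box of a segmented region (hypothetical type)."""
    x0: int
    y0: int
    x1: int
    y1: int

    @property
    def height(self) -> int:
        return self.y1 - self.y0


def merge(a: Region, b: Region) -> Region:
    """Smallest box that covers both regions."""
    return Region(min(a.x0, b.x0), min(a.y0, b.y0),
                  max(a.x1, b.x1), max(a.y1, b.y1))


def horizontally_overlap(a: Region, b: Region) -> bool:
    """True if the x-extents of the two boxes intersect."""
    return a.x0 < b.x1 and b.x0 < a.x1


def group_regions(regions, max_gap):
    """Bottom-up grouping: scan regions top to bottom and merge a region
    into the previous group when the two overlap horizontally and the
    vertical gap between them is at most max_gap (assumed threshold)."""
    groups = []
    for r in sorted(regions, key=lambda r: (r.y0, r.x0)):
        if (groups
                and horizontally_overlap(groups[-1], r)
                and r.y0 - groups[-1].y1 <= max_gap):
            groups[-1] = merge(groups[-1], r)
        else:
            groups.append(r)
    return groups


# Hypothetical rule base: (label, predicate) pairs tried in order.
# The 40-pixel height limit is an invented value, not taken from the paper.
RULES = [
    ("text line", lambda r: r.height <= 40),
    ("nontext", lambda r: True),  # fallback rule
]


def identify(region: Region) -> str:
    """Return the label of the first rule whose predicate holds."""
    for label, predicate in RULES:
        if predicate(region):
            return label
    return "unknown"


if __name__ == "__main__":
    segmented = [
        Region(0, 0, 100, 20),     # upper fragment of an over-segmented figure
        Region(0, 22, 100, 90),    # lower fragment of the same figure
        Region(0, 200, 100, 230),  # an isolated text line
    ]
    for g in group_regions(segmented, max_gap=5):
        print(g, identify(g))
```

Keeping the rules in a plain ordered list mirrors the general notion of a rule base that can be extended with publication-specific entries without touching the grouping code; a real system would need far richer features than bounding-box height.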

Index Terms:
Document image analysis, geometric structure analysis, region segmentation, region identification, knowledge-based approach.
Citation:
Kyong-Ho Lee, Yoon-Chul Choy, Sung-Bae Cho, "Geometric Structure Analysis of Document Images: A Knowledge-Based Approach," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 11, pp. 1224-1240, Nov. 2000, doi:10.1109/34.888708