Parameter-Free Geometric Document Layout Analysis
November 2001 (vol. 23 no. 11)
pp. 1240-1256

Abstract: Automatic transformation of paper documents into electronic documents requires geometric document layout analysis as its first stage. However, variations in character font sizes, text line spacing, and document layout structures have long made it difficult to design a general-purpose document layout analysis algorithm, and previous methods have therefore been unable to avoid the use of some parameters. In this paper, we propose a parameter-free method for segmenting document images into maximal homogeneous regions and identifying them as text, images, tables, and ruling lines. A pyramidal quadtree structure is constructed for multiscale analysis, and a periodicity measure is proposed to detect the periodic attribute of text regions for page segmentation. To obtain robust page segmentation results, a confirmation procedure using texture analysis is applied only to ambiguous regions. Based on the proposed periodicity measure, multiscale analysis, and confirmation procedure, we developed a robust method for geometric document layout analysis that is independent of character font sizes, text line spacing, and document layout structures. The proposed method was evaluated on the document database from the University of Washington and the MediaTeam Document Database. The results of these tests show that the proposed method is more accurate than previous methods.
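The abstract does not define the periodicity measure, so the following is only an illustrative sketch of the general idea that text regions are periodic: it scores a region by the normalized autocorrelation of its horizontal projection profile. The functions `projection_profile` and `periodicity`, the binary-image convention (1 = ink), and the choice of autocorrelation are all assumptions made for illustration, not the measure proposed in the paper.

```python
def projection_profile(region):
    """Count ink pixels in each row of a binary region (assumed: 1 = ink)."""
    return [sum(row) for row in region]


def periodicity(profile):
    """Return (lag, score) for the lag with the highest normalized
    autocorrelation of the mean-removed profile.

    Hypothetical stand-in for a periodicity measure: a high score suggests
    regularly spaced text lines, a score of 0 suggests no periodic structure.
    """
    n = len(profile)
    mean = sum(profile) / n
    centered = [v - mean for v in profile]
    energy = sum(v * v for v in centered)
    if energy == 0:  # flat profile (blank or solid region): nothing periodic
        return 0, 0.0

    best_lag, best_score = 0, 0.0
    # Start at lag 2 to skip the trivial correlation between adjacent rows.
    for lag in range(2, n // 2 + 1):
        score = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / energy
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score


# Synthetic "text" region: 2 ink rows followed by 2 blank rows, repeated 6 times.
text_region = ([[1] * 10] * 2 + [[0] * 10] * 2) * 6
# Synthetic solid region, standing in for a dark image block.
solid_region = [[1] * 10] * 24

text_lag, text_score = periodicity(projection_profile(text_region))
solid_lag, solid_score = periodicity(projection_profile(solid_region))
print(text_lag, round(text_score, 3))
print(solid_lag, solid_score)
```

Note that nothing in this sketch is tuned to a font size or line spacing: the line pitch is read off the data as the best lag rather than supplied as a parameter, which is the property the abstract emphasizes.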

Index Terms:
Geometric document layout analysis, parameter-free method, periodicity estimation, multiscale analysis, page segmentation.
Citation:
Seong-Whan Lee, Dae-Seok Ryu, "Parameter-Free Geometric Document Layout Analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 11, pp. 1240-1256, Nov. 2001, doi:10.1109/34.969115