Document Representation and Its Application to Page Decomposition
March 1998 (vol. 20 no. 3)
pp. 294-308

Abstract: Transforming a paper document to its electronic version in a form suitable for efficient storage, retrieval, and interpretation continues to be a challenging problem. An efficient representation scheme for document images is necessary to solve this problem. Document representation involves thresholding, skew detection, geometric layout analysis, and logical layout analysis. The derived representation can then be used in document storage and retrieval. Page segmentation is an important stage in representing document images obtained by scanning journal pages. The performance of a document understanding system depends greatly on the correctness of page segmentation and on the labeling of different regions such as text, tables, images, drawings, and rulers. In this paper, we use the traditional bottom-up approach based on connected component extraction to efficiently implement page segmentation and region identification. We propose a new document model that preserves top-down generation information; based on this model, a document is logically represented for interactive editing, storage, retrieval, transfer, and logical analysis. Our algorithm has high accuracy and takes approximately 1.4 seconds on an SGI Indy workstation for model creation, including orientation estimation, segmentation, and labeling (text, table, image, drawing, and ruler), for a 2,550 × 3,300 image of a typical journal page scanned at 300 dpi. The method is applicable to documents from various technical journals and can accommodate moderate amounts of skew and noise.
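
As a rough illustration of the connected component extraction step that bottom-up page segmentation starts from, the following Python sketch labels 8-connected foreground regions in a tiny binary image and returns their bounding boxes. It is not the algorithm of the paper: the function name `connected_component_boxes`, the choice of 8-connectivity, the breadth-first traversal, and the toy `page` array are all assumptions made for this example.

```python
from collections import deque


def connected_component_boxes(image):
    """Return (top, left, bottom, right) boxes of 8-connected foreground regions.

    `image` is a list of rows holding 0 (background) or 1 (foreground).
    Hypothetical helper for illustration only.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Start a new component and grow it with a breadth-first search.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                # Visit all eight neighbours of the current pixel.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Toy "page": two small blobs and one long horizontal stroke.
page = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
]
boxes = connected_component_boxes(page)
print(boxes)
```

In a bottom-up scheme, boxes like these would then be merged into larger units and labeled by their geometry; for instance, a very wide and thin box such as the last one here might be a candidate for a ruler. The merging and labeling rules are specific to each system and are not shown.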

Index Terms:
Document model, document storage and retrieval, page segmentation, region identification, document image analysis.
Citation:
Anil K. Jain, Bin Yu, "Document Representation and Its Application to Page Decomposition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 3, pp. 294-308, March 1998, doi:10.1109/34.667886