Multiscale Segmentation of Unstructured Document Pages Using Soft Decision Integration
January 1997 (vol. 19 no. 1)
pp. 92-96

Abstract—We present an algorithm for layout-independent document page segmentation based on document texture using multiscale feature vectors and fuzzy local decision information. Multiscale feature vectors are classified locally using a neural network to allow soft/fuzzy multi-class membership assignments. Segmentation is performed by integrating soft local decision vectors to reduce their "ambiguities."
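
The integration step can be illustrated with a minimal sketch. The abstract does not state the integration rule, so the version below assumes a plain neighbourhood average of the soft decision vectors followed by an argmax; this is an illustration of the general idea, not the authors' method. The class names, the hand-written membership vectors (standing in for neural network outputs), and the function names `integrate_soft_decisions` and `hard_labels` are all hypothetical.

```python
from typing import List

Vector = List[float]       # soft class memberships for one local block
Grid = List[List[Vector]]  # one membership vector per block on the page

# Hypothetical class set; the abstract does not list the classes used.
CLASSES = ["text", "image", "graphics"]


def integrate_soft_decisions(soft: Grid, radius: int = 1) -> Grid:
    """Average each block's membership vector with its neighbours.

    Assumption: a uniform average over a (2*radius+1)-wide window,
    clipped at the page border.
    """
    rows, cols = len(soft), len(soft[0])
    n_classes = len(soft[0][0])
    out: Grid = []
    for r in range(rows):
        out_row: List[Vector] = []
        for c in range(cols):
            acc = [0.0] * n_classes
            count = 0
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    for k in range(n_classes):
                        acc[k] += soft[rr][cc][k]
                    count += 1
            out_row.append([a / count for a in acc])
        out.append(out_row)
    return out


def hard_labels(soft: Grid) -> List[List[int]]:
    """Pick the index of the largest membership in every block."""
    return [[max(range(len(v)), key=v.__getitem__) for v in row] for row in soft]


# A 3x3 patch of confident "text" blocks around one ambiguous block.
text = [0.8, 0.1, 0.1]
ambiguous = [0.3, 0.4, 0.3]
local = [
    [text, text, text],
    [text, ambiguous, text],
    [text, text, text],
]

before = hard_labels(local)                            # centre block leans "image"
after = hard_labels(integrate_soft_decisions(local))   # neighbours outvote it
```

Deciding on each block in isolation labels the centre block "image", while the integrated decision agrees with the surrounding "text" blocks. Keeping the memberships soft until after integration is what lets weak local evidence be overridden by consistent context.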

Index Terms:
Document processing, multiscale analysis, context dependent classification, soft decision integration, wavelet packets, neural networks.
Citation:
Kamran Etemad, David Doermann, Rama Chellappa, "Multiscale Segmentation of Unstructured Document Pages Using Soft Decision Integration," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 1, pp. 92-96, Jan. 1997, doi:10.1109/34.566817