Texture for Script Identification
November 2005 (vol. 27 no. 11)
pp. 1720-1732
The problem of determining the script and language of a document image has a number of important applications in the field of document analysis, such as indexing and sorting of large collections of such images, or as a precursor to optical character recognition (OCR). In this paper, we investigate the use of texture as a tool for determining the script of a document image, based on the observation that text has a distinct visual texture. An experimental evaluation of a number of commonly used texture features is conducted on a newly created script database, providing a qualitative measure of which features are most appropriate for this task. Strategies for improving classification results in situations with limited training data and multiple font types are also proposed.
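To make the idea of a "texture feature" concrete, here is a minimal sketch of one widely used family, gray-level co-occurrence statistics. This abstract does not say which features the paper evaluates or how they are computed, so the function names (`glcm`, `glcm_features`), the quantisation to 8 gray levels, and the choice of energy, contrast, and entropy are all assumptions made for illustration only.

```python
import numpy as np

def glcm(image, dx=1, dy=0, levels=8):
    """Normalised gray-level co-occurrence matrix for one offset (dx, dy >= 0).

    Hypothetical helper: assumes an 8-bit grayscale image with values in [0, 255].
    """
    img = np.asarray(image)
    # Quantise to `levels` gray levels so the matrix stays small.
    q = (img.astype(np.int64) * levels) // 256
    h, w = q.shape
    # Pair each pixel with its neighbour at the given offset.
    a = q[0:h - dy, 0:w - dx]
    b = q[dy:h, dx:w]
    m = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(m, (a.ravel(), b.ravel()), 1.0)  # count co-occurring level pairs
    return m / m.sum()

def glcm_features(p):
    """Energy, contrast, and entropy of a normalised co-occurrence matrix."""
    i, j = np.indices(p.shape)
    energy = float((p ** 2).sum())
    contrast = float((p * (i - j) ** 2).sum())
    nz = p[p > 0]  # skip empty cells to avoid log(0)
    entropy = float(-(nz * np.log2(nz)).sum())
    return np.array([energy, contrast, entropy])
```

A vector like this, computed on a block of text from a document image, could be passed to any standard classifier trained on labelled samples of each script. That pairing is likewise an illustrative assumption, not a description of the paper's pipeline.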

Index Terms:
Script identification, wavelets and fractals, texture, document analysis, clustering, classification and association rules.
Citation:
Andrew Busch, Wageeh W. Boles, Sridha Sridharan, "Texture for Script Identification," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 11, pp. 1720-1732, Nov. 2005, doi:10.1109/TPAMI.2005.227