Online Handwritten Script Recognition
January 2004 (vol. 26 no. 1)
pp. 124-130

Abstract—Automatic identification of handwritten script facilitates many important applications such as automatic transcription of multilingual documents and search for documents on the Web containing a particular script. The increase in usage of handheld devices which accept handwritten input has created a growing demand for algorithms that can efficiently analyze and retrieve handwritten data. This paper proposes a method to classify words and lines in an online handwritten document into one of the six major scripts: Arabic, Cyrillic, Devnagari, Han, Hebrew, or Roman. The classification is based on 11 different spatial and temporal features extracted from the strokes of the words. The proposed system attains an overall classification accuracy of 87.1 percent at the word level with 5-fold cross validation on a data set containing 13,379 words. The classification accuracy improves to 95 percent as the number of words in the test sample is increased to five, and to 95.5 percent for complete text lines consisting of an average of seven words.
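The gain in accuracy from single words to multi-word samples suggests that per-word decisions are being pooled. The abstract does not say how the evidence is combined, so the following is only a minimal sketch under one common assumption: each word yields a posterior probability over the six scripts, and the words are treated as independent, so their log-posteriors are summed. The function name `combine_word_posteriors` and the probability values are hypothetical and are not taken from the paper.

```python
import math

# The six script classes named in the abstract.
SCRIPTS = ["Arabic", "Cyrillic", "Devnagari", "Han", "Hebrew", "Roman"]


def combine_word_posteriors(word_posteriors):
    """Return the script with the highest summed log-posterior.

    Assumption (not from the paper): words are independent, so the
    per-word posteriors can be multiplied, i.e. their logs summed.
    """
    totals = [0.0] * len(SCRIPTS)
    for posterior in word_posteriors:
        for i, p in enumerate(posterior):
            # Clamp to avoid log(0) for classes a word rules out entirely.
            totals[i] += math.log(max(p, 1e-12))
    best = max(range(len(SCRIPTS)), key=lambda i: totals[i])
    return SCRIPTS[best]


# Hypothetical per-word classifier outputs for one text line.
words = [
    [0.05, 0.45, 0.05, 0.05, 0.05, 0.35],  # ambiguous word, leans Cyrillic
    [0.05, 0.20, 0.05, 0.05, 0.05, 0.60],  # leans Roman
    [0.05, 0.25, 0.05, 0.05, 0.05, 0.55],  # leans Roman
]

print(combine_word_posteriors(words[:1]))  # decision from the first word only
print(combine_word_posteriors(words))      # decision from all three words
```

In this toy input the first word taken alone is labeled Cyrillic, while the three words together are labeled Roman, which illustrates how a longer test sample can correct an error made on a single ambiguous word.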

Index Terms:
Document understanding, handwritten script identification, online document, evidence accumulation, feature design.
Anoop M. Namboodiri, Anil K. Jain, "Online Handwritten Script Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 1, pp. 124-130, Jan. 2004, doi:10.1109/TPAMI.2004.10009