Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)
Curitiba, Parana, Brazil
Sept. 23, 2007 to Sept. 26, 2007
A. Jayaraman, Indian Institute of Technology Madras, Chennai, India
C.C. Sekhar, Indian Institute of Technology Madras, Chennai, India
V.S. Chakravarthy, Indian Institute of Technology Madras, Chennai, India
In this paper, we address some issues in developing an online handwritten character recognition (HCR) system for an Indian language script, Telugu. The number of characters in this script is estimated to be around 5000. A character in this script is written as a sequence of strokes. The set of strokes in Telugu consists of 253 unique strokes. As the similarity among several strokes is high, we propose a modular approach for recognition of strokes. Based on the relative position of a stroke in a character, the stroke set has been divided into three subsets, namely, baseline strokes, bottom strokes and top strokes. Classifiers for the different subsets of strokes are built using support vector machines (SVMs). We study the performance of the classifiers for subsets of strokes and propose methods to improve their performance. A comparative study using hidden Markov models (HMMs) shows that the SVM-based approach gives a significantly better performance.
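The modular idea in the abstract, routing each stroke to one of three position-based subsets and classifying it with a model trained only on that subset, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the band thresholds, the toy features, the names `subset_of`, `features`, `CentroidClassifier` and `ModularStrokeRecognizer`, and the convention that y is normalized to [0, 1] and increases upward. A nearest-centroid classifier stands in for the per-subset SVMs so that the sketch has no dependencies.

```python
from dataclasses import dataclass, field
from math import dist

# Hypothetical thresholds: stroke coordinates are assumed to be normalized so
# that y lies in [0, 1], increases upward, and the baseline band is [0.33, 0.66].
BAND_LOW, BAND_HIGH = 0.33, 0.66


def subset_of(stroke):
    """Assign a stroke (list of (x, y) points) to a subset by its mean height."""
    mean_y = sum(y for _, y in stroke) / len(stroke)
    if mean_y > BAND_HIGH:
        return "top"
    if mean_y < BAND_LOW:
        return "bottom"
    return "baseline"


def features(stroke):
    """Toy feature vector (an assumption): mean point and bounding-box size."""
    xs = [x for x, _ in stroke]
    ys = [y for _, y in stroke]
    return (sum(xs) / len(xs), sum(ys) / len(ys),
            max(xs) - min(xs), max(ys) - min(ys))


@dataclass
class CentroidClassifier:
    """Stand-in for a per-subset SVM: predicts the nearest class centroid."""
    centroids: dict = field(default_factory=dict)

    def fit(self, vectors, labels):
        grouped = {}
        for vec, lab in zip(vectors, labels):
            grouped.setdefault(lab, []).append(vec)
        # Centroid of each class = component-wise mean of its feature vectors.
        self.centroids = {
            lab: tuple(sum(col) / len(vecs) for col in zip(*vecs))
            for lab, vecs in grouped.items()
        }
        return self

    def predict(self, vector):
        return min(self.centroids, key=lambda lab: dist(vector, self.centroids[lab]))


class ModularStrokeRecognizer:
    """One classifier per subset; a stroke is only compared within its subset."""

    def __init__(self):
        self.models = {}

    def fit(self, strokes, labels):
        by_subset = {}
        for stroke, lab in zip(strokes, labels):
            vecs, labs = by_subset.setdefault(subset_of(stroke), ([], []))
            vecs.append(features(stroke))
            labs.append(lab)
        self.models = {name: CentroidClassifier().fit(vecs, labs)
                       for name, (vecs, labs) in by_subset.items()}
        return self

    def predict(self, stroke):
        name = subset_of(stroke)
        return name, self.models[name].predict(features(stroke))


# Tiny made-up training set: two top strokes, one baseline, one bottom stroke.
train_strokes = [
    [(0.1, 0.8), (0.3, 0.9)],
    [(0.6, 0.7), (0.9, 0.95)],
    [(0.1, 0.4), (0.5, 0.6)],
    [(0.2, 0.1), (0.4, 0.2)],
]
train_labels = ["t1", "t2", "b1", "u1"]
recognizer = ModularStrokeRecognizer().fit(train_strokes, train_labels)
```

Calling `recognizer.predict(stroke)` returns the subset name together with the label chosen inside that subset. Splitting the label set this way means each classifier only has to separate the strokes that share a position, which is the motivation the abstract gives for the modular design.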
A. Jayaraman, C. Sekhar and V. Chakravarthy, "Modular Approach to Recognition of Strokes in Telugu Script," Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), Curitiba, Parana, Brazil, 2007, pp. 501-505.