On the Dependence of Handwritten Word Recognizers on Lexicons
December 2002 (vol. 24 no. 12)
pp. 1553-1564

Abstract: The performance of any word recognizer depends on the lexicon presented. Large lexicons, or lexicons containing similar entries, usually pose difficulty for recognizers. However, the literature lacks a quantitative methodology for capturing the precise dependence between word recognizers and lexicons. This paper presents a performance model that views word recognition as a function of character recognition and statistically "discovers" the relation between a word recognizer and the lexicon. It uses model parameters that capture a recognizer's ability to distinguish characters (of the alphabet) and its sensitivity to lexicon size. These parameters are determined by a multiple regression model derived from the performance model. Such a model is useful for comparing word recognizers by predicting their performance based on the lexicon presented. We demonstrate the performance model with extensive experiments on five different word recognizers, thousands of images, and tens of lexicons. The results show that the model fits the training data well and also predicts the recognizers' performance on testing data.
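
As a rough illustration of what fitting a multiple regression against lexicon descriptors can look like, the following is a minimal sketch, not the paper's performance model. It assumes a hypothetical linear relation between a recognizer's log error rate and two lexicon descriptors (log lexicon size and a mean inter-entry distance). The functional form, the names `fit_multiple_regression` and `predict`, and all numbers are illustrative assumptions; the data are synthetic and noise-free.

```python
import numpy as np


def fit_multiple_regression(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bk]."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply fitted coefficients to new rows of predictors."""
    X = np.asarray(X, dtype=float)
    return coef[0] + X @ coef[1:]


# Synthetic lexicon descriptors (assumed for illustration only).
lexicon_sizes = np.array([10, 100, 1000, 10000, 10, 100, 1000, 10000], dtype=float)
mean_distances = np.array([2.0, 2.0, 2.0, 2.0, 4.0, 4.0, 4.0, 4.0])
features = np.column_stack([np.log(lexicon_sizes), mean_distances])

# Hypothetical "true" coefficients used only to generate the synthetic response.
true_coef = np.array([-3.0, 0.4, -0.5])
log_error = true_coef[0] + features @ true_coef[1:]

# Fit on the synthetic data, then predict for an unseen lexicon description.
coef = fit_multiple_regression(features, log_error)
new_lexicon = np.array([[np.log(500.0), 3.0]])
predicted_log_error = predict(coef, new_lexicon)
```

Because the synthetic response is generated without noise, the fitted coefficients match the generating ones; with real recognizer results the fit would be approximate, and held-out lexicons would be needed to judge predictive quality.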

Index Terms:
Handwriting recognition, word recognition, performance prediction, performance model, multiple regression.
Citation:
Hanhong Xue, Venu Govindaraju, "On the Dependence of Handwritten Word Recognizers on Lexicons," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 12, pp. 1553-1564, Dec. 2002, doi:10.1109/TPAMI.2002.1114848