Issue No.03 - March (2009 vol.31)
pp: 444-457
Ujjwal Bhattacharya , Indian Statistical Institute, Kolkata
B.B. Chaudhuri , Indian Statistical Institute, Kolkata
ABSTRACT
This article primarily concerns the recognition of isolated handwritten numerals of major Indian scripts. The principal contributions are (a) the pioneering development of two databases of handwritten numerals for the two most popular Indian scripts, (b) a multistage cascaded recognition scheme using wavelet-based multiresolution representations and multilayer perceptron classifiers, and (c) the application of (b) to the recognition of mixed handwritten numerals of three Indian scripts: Devanagari, Bangla, and English. The databases contain 22,556 and 23,392 isolated handwritten numeral samples of Devanagari and Bangla, respectively, collected from real-life situations, and can be made available free of cost to researchers at other academic institutions. In the proposed scheme, a numeral is passed in a cascaded manner through three multilayer perceptron classifiers corresponding to three coarse-to-fine resolution levels. If rejection occurs even at the highest resolution, another multilayer perceptron makes a final attempt to recognize the input numeral by combining the outputs of the three classifiers of the previous stages. This scheme has been extended to situations in which the script of a document is not known a priori or the numerals written on a document belong to different scripts. Handwritten numerals in mixed scripts are frequently found in Indian postal mail and table-form documents.
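The control flow of the cascade described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `cascade_classify`, `make_stub`, and `make_average_combiner` are hypothetical, the acceptance rule (top score at or above `accept_threshold`) is an assumed stand-in because the abstract does not state the rejection criterion, and fixed-score stubs and a simple averaging combiner replace the trained multilayer perceptrons.

```python
from typing import Callable, List, Sequence, Tuple

Scores = List[float]
Classifier = Callable[[Sequence[float]], Scores]


def argmax(scores: Sequence[float]) -> int:
    """Index of the largest score (first one wins on ties)."""
    return max(range(len(scores)), key=scores.__getitem__)


def cascade_classify(
    views: Sequence[Sequence[float]],
    stage_classifiers: Sequence[Classifier],
    combiner: Classifier,
    accept_threshold: float = 0.9,  # assumed rejection rule, not from the paper
) -> Tuple[int, str]:
    """Return (predicted class index, label of the stage that decided).

    views[i] is the feature vector at resolution level i, ordered from
    coarse to fine; stage_classifiers[i] scores that view.
    """
    if len(views) != len(stage_classifiers):
        raise ValueError("need one feature view per stage classifier")

    collected: Scores = []
    for stage, (view, classify) in enumerate(
        zip(views, stage_classifiers), start=1
    ):
        scores = classify(view)
        collected.extend(scores)
        best = argmax(scores)
        if scores[best] >= accept_threshold:
            return best, f"stage {stage}"  # accepted, stop the cascade here

    # Every stage rejected: the final classifier sees all stage outputs.
    return argmax(combiner(collected)), "combiner"


def make_stub(scores: Sequence[float]) -> Classifier:
    """Hypothetical stand-in for a trained classifier: fixed scores."""
    return lambda _features: list(scores)


def make_average_combiner(n_classes: int) -> Classifier:
    """Stand-in combiner that averages per-class scores across stages."""
    def combine(collected: Sequence[float]) -> Scores:
        n_stages = len(collected) // n_classes
        return [
            sum(collected[s * n_classes + c] for s in range(n_stages)) / n_stages
            for c in range(n_classes)
        ]
    return combine


if __name__ == "__main__":
    dummy_views = [[0.0], [0.0], [0.0]]  # placeholder feature vectors
    stages = [
        make_stub([0.40, 0.35, 0.25]),  # coarse level: rejected
        make_stub([0.50, 0.30, 0.20]),  # middle level: rejected
        make_stub([0.95, 0.03, 0.02]),  # fine level: accepted
    ]
    print(cascade_classify(dummy_views, stages, make_average_combiner(3)))
```

In this sketch the cheaper coarse-resolution classifier handles the easy samples, and only ambiguous ones reach the finer levels or the final combining stage.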
INDEX TERMS
Handwriting analysis, Optical character recognition
CITATION
Ujjwal Bhattacharya, B.B. Chaudhuri, "Handwritten Numeral Databases of Indian Scripts and Multistage Recognition of Mixed Numerals", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.31, no. 3, pp. 444-457, March 2009, doi:10.1109/TPAMI.2008.88