Issue No. 05 - May 2009 (vol. 31)
pp. 855-868
Alex Graves, Technische Universität München, Munich
Marcus Liwicki, Research Group Knowledge Management, DFKI-German Research Center for Artificial Intelligence, Kaiserslautern
Santiago Fernández, IDSIA, Switzerland
Roman Bertolami, Institute of Computer Science and Applied Mathematics, Research Group on Computer Vision and Artificial Intelligence, Bern
Horst Bunke, Institute of Computer Science and Applied Mathematics, Research Group on Computer Vision and Artificial Intelligence, Bern
Jürgen Schmidhuber, Technische Universität München, Munich
ABSTRACT
Recognizing lines of unconstrained handwritten text is a challenging task. The difficulty of segmenting cursive or overlapping characters, combined with the need to exploit surrounding context, has led to low recognition rates for even the best current recognizers. Most recent progress in the field has been made either through improved preprocessing or through advances in language modeling. Relatively little work has been done on the basic recognition algorithms. Indeed, most systems rely on the same hidden Markov models that have been used for decades in speech and handwriting recognition, despite their well-known shortcomings. This paper proposes an alternative approach based on a novel type of recurrent neural network, specifically designed for sequence labeling tasks where the data is hard to segment and contains long-range bidirectional interdependencies. In experiments on two large unconstrained handwriting databases, our approach achieves word recognition accuracies of 79.7 percent on online data and 74.1 percent on offline data, significantly outperforming a state-of-the-art HMM-based system. In addition, we demonstrate the network's robustness to lexicon size, measure the individual influence of its hidden layers, and analyze its use of context. Last, we provide an in-depth discussion of the differences between the network and HMMs, suggesting reasons for the network's superior performance.
INDEX TERMS
Handwriting recognition, online handwriting, offline handwriting, connectionist temporal classification, bidirectional long short-term memory, recurrent neural networks, hidden Markov model.
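As a rough illustration of how connectionist temporal classification lets a network label text without segmenting it first, the sketch below shows the simplest CTC decoding rule: take the most probable label at each time step, merge consecutive repeats, and drop the "blank" label. It is a minimal sketch under stated assumptions, not the decoder used in the paper, which may rely on a more elaborate dictionary-constrained search. The function name, the toy alphabet, the blank index, and the score values are all hypothetical.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC "blank" (no-label) output


def ctc_best_path_decode(frame_scores, alphabet):
    """Greedy CTC decoding: argmax per frame, merge repeats, drop blanks.

    frame_scores: one list of per-label scores for each time step.
    alphabet: label strings indexed like the scores; alphabet[BLANK] is unused.
    """
    # Most probable label index at each time step.
    best = [max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores]
    # Merge runs of the same label, so "a a" across adjacent frames is one "a".
    collapsed = [label for label, _ in groupby(best)]
    # Remove blanks; a blank between two equal labels keeps them distinct.
    return "".join(alphabet[label] for label in collapsed if label != BLANK)


# Toy example with a hypothetical three-symbol output layer.
alphabet = ["-", "a", "b"]
frames = [
    [0.10, 0.80, 0.10],  # a
    [0.10, 0.70, 0.20],  # a (repeat of the previous frame)
    [0.90, 0.05, 0.05],  # blank
    [0.20, 0.70, 0.10],  # a (new character, separated by the blank)
    [0.10, 0.10, 0.80],  # b
]
print(ctc_best_path_decode(frames, alphabet))  # aab
```

Because the blank separates the two runs of "a", the five frames decode to three characters without any explicit character boundaries being supplied.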
CITATION
Alex Graves, Marcus Liwicki, Santiago Fernández, Roman Bertolami, Horst Bunke, Jürgen Schmidhuber, "A Novel Connectionist System for Unconstrained Handwriting Recognition," IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 31, no. 5, pp. 855-868, May 2009, doi:10.1109/TPAMI.2008.137