Using Generative Models for Handwritten Digit Recognition
June 1996 (vol. 18 no. 6)
pp. 592-606

Abstract: We describe a method of recognizing handwritten digits by fitting generative models that are built from deformable B-splines with Gaussian "ink generators" spaced along the length of the spline. The splines are adjusted using a novel elastic matching procedure based on the Expectation Maximization (EM) algorithm that maximizes the likelihood of the model generating the data. This approach has many advantages. 1) After identifying the model most likely to have generated the data, the system produces not only a classification of the digit but also a rich description of the instantiation parameters, which can yield information such as the writing style. 2) During the process of explaining the image, generative models can perform recognition-driven segmentation. 3) The method involves a relatively small number of parameters, and hence training is relatively easy and fast. 4) Unlike many other recognition schemes, it does not rely on some form of pre-normalization of input images, but can handle arbitrary scalings, translations, and a limited degree of image rotation. We have demonstrated that our method of fitting models to images does not get trapped in poor local minima. The main disadvantage of the method is that it requires much more computation than more standard OCR techniques.
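
The fit-then-compare idea in the abstract can be illustrated with a minimal sketch. It is a heavy simplification and not the authors' implementation: the "ink generators" here are fixed bead positions rather than beads spaced along a deformable B-spline, only a 2-D translation is fitted (no scaling, rotation, or shape deformation), and the Gaussian variance is held fixed. The function names, the two toy digit models, and the value of `sigma` are all hypothetical choices made for illustration.

```python
import math


def e_step(beads, ink, t, sigma):
    """Return (responsibilities, log-likelihood) for beads translated by t."""
    two_var = 2.0 * sigma ** 2
    # Uniform mixing weight 1/K times the 2-D isotropic Gaussian normaliser.
    log_norm = -math.log(len(beads)) - math.log(math.pi * two_var)
    resp, loglik = [], 0.0
    for px, py in ink:
        d2 = [(px - bx - t[0]) ** 2 + (py - by - t[1]) ** 2 for bx, by in beads]
        d_min = min(d2)  # subtract the minimum so the weights cannot all underflow
        w = [math.exp(-(d - d_min) / two_var) for d in d2]
        total = sum(w)
        resp.append([v / total for v in w])
        loglik += log_norm - d_min / two_var + math.log(total)
    return resp, loglik


def fit_translation(beads, ink, sigma=1.0, iters=25):
    """EM fit of a translation; returns (translation, log-likelihood)."""
    n, k = len(ink), len(beads)
    # Initialise by aligning the centroid of the model with that of the ink.
    t = (sum(p[0] for p in ink) / n - sum(b[0] for b in beads) / k,
         sum(p[1] for p in ink) / n - sum(b[1] for b in beads) / k)
    for _ in range(iters):
        resp, _ = e_step(beads, ink, t, sigma)
        # M-step: responsibility-weighted mean offset between ink and beads.
        sx = sum(r * (p[0] - b[0]) for p, row in zip(ink, resp)
                 for r, b in zip(row, beads))
        sy = sum(r * (p[1] - b[1]) for p, row in zip(ink, resp)
                 for r, b in zip(row, beads))
        t = (sx / n, sy / n)
    return t, e_step(beads, ink, t, sigma)[1]


def classify(models, ink, sigma=1.0):
    """Fit every model and pick the one with the highest log-likelihood."""
    fits = {name: fit_translation(beads, ink, sigma)
            for name, beads in models.items()}
    best = max(fits, key=lambda name: fits[name][1])
    return best, fits


# Hypothetical toy models: bead centres in each model's own frame.
MODELS = {
    "one": [(0.0, float(y)) for y in range(5)],
    "seven": [(0.0, 4.0), (1.0, 4.0), (2.0, 4.0),
              (1.5, 3.0), (1.0, 2.0), (0.5, 1.0), (0.0, 0.0)],
}

# Inked pixels: a vertical stroke shifted away from the model frame.
ink = [(3.0, y + 2.0) for y in range(5)]
label, fits = classify(MODELS, ink)
```

In this toy setting the fitted translation plays the role of the instantiation parameters: `fits[label][0]` says where the winning model had to be placed to explain the ink, while `label` gives the classification.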

Index Terms:
Deformable model, elastic net, optical character recognition, generative model, probabilistic model, mixture model.
Michael Revow, Christopher K.I. Williams, Geoffrey E. Hinton, "Using Generative Models for Handwritten Digit Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 6, pp. 592-606, June 1996, doi:10.1109/34.506410