Incorporating Language Syntax in Visual Text Recognition with a Statistical Model
December 1996 (vol. 18 no. 12)
pp. 1251-1256

Abstract: This paper discusses the use of a statistical language model to improve the performance of an algorithm that recognizes digital images of handwritten or machine-printed text. A word recognition algorithm first determines, for each input word image, a set of visually similar words from a lexicon (called a neighborhood). The syntactic classifications of those words and the transition probabilities between the classifications are input to the Viterbi algorithm. For each sentence, the Viterbi algorithm determines the sequence of syntactic classes (the states of an underlying Markov process) that has the maximum a posteriori probability, given the observed neighborhoods. The performance of the word recognition algorithm is improved by removing from each neighborhood the words whose classes do not appear on the estimated state sequence.
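
The procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names `viterbi_tags` and `prune`, the toy lexicon, the tag names, and all probabilities are hypothetical, and the observation score (the fraction of a neighborhood's words that carry a given class) is an assumption made here because the abstract does not specify one. The sketch uses first-order transitions only.

```python
import math


def viterbi_tags(neighborhoods, lexicon, trans, start):
    """Return the most probable class sequence for one sentence.

    neighborhoods: list of lists of candidate words, one list per word image
    lexicon:       dict word -> set of syntactic classes (hypothetical data)
    trans:         dict (prev_class, class) -> P(class | prev_class)
    start:         dict class -> P(class at the start of a sentence)
    """
    def emit(tag, hood):
        # ASSUMPTION: score a class by the fraction of the neighborhood
        # that carries it; the abstract does not define this term.
        return sum(tag in lexicon[w] for w in hood) / len(hood)

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # Candidate classes at each position come from the neighborhood's words.
    cands = [sorted({t for w in hood for t in lexicon[w]})
             for hood in neighborhoods]

    # best[i][t] = (log score of the best path ending in t, back-pointer)
    best = [{t: (log(start.get(t, 0.0)) + log(emit(t, neighborhoods[0])), None)
             for t in cands[0]}]
    for i in range(1, len(neighborhoods)):
        column = {}
        for t in cands[i]:
            e = log(emit(t, neighborhoods[i]))
            column[t] = max(
                (best[i - 1][p][0] + log(trans.get((p, t), 0.0)) + e, p)
                for p in cands[i - 1]
            )
        best.append(column)

    # Trace back from the best final class.
    tag = max(best[-1], key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(neighborhoods) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]


def prune(neighborhoods, lexicon, path):
    """Drop words whose classes are not on the estimated state sequence."""
    return [[w for w in hood if tag in lexicon[w]]
            for hood, tag in zip(neighborhoods, path)]


# Toy example (all values invented for illustration).
lexicon = {"the": {"DET"}, "he": {"PRON"}, "dog": {"NOUN"},
           "dig": {"VERB"}, "barks": {"VERB"}, "parks": {"NOUN"}}
neighborhoods = [["the", "he"], ["dog", "dig"], ["barks", "parks"]]
start = {"DET": 0.6, "PRON": 0.4}
trans = {("DET", "NOUN"): 0.9, ("DET", "VERB"): 0.1,
         ("PRON", "NOUN"): 0.2, ("PRON", "VERB"): 0.8,
         ("NOUN", "NOUN"): 0.3, ("NOUN", "VERB"): 0.7,
         ("VERB", "NOUN"): 0.6, ("VERB", "VERB"): 0.4}

path = viterbi_tags(neighborhoods, lexicon, trans, start)
print(path)                                   # ['DET', 'NOUN', 'VERB']
print(prune(neighborhoods, lexicon, path))    # [['the'], ['dog'], ['barks']]
```

In this toy run each neighborhood shrinks from two candidates to one. In general a neighborhood keeps every word that shares the selected class, so pruning reduces the candidate set without necessarily reaching a single word.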

An experimental application is demonstrated with a neighborhood generation algorithm that produces a number of guesses about the identity of each word in a running text. The use of zero-, first-, and second-order transition probabilities and of different levels of noise in estimating the neighborhood is explored.
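
As a rough illustration of what transition probabilities of different orders could look like, the Python sketch below estimates them by relative frequency from sentences that are already labeled with syntactic classes. It assumes that an order-k probability conditions a class on the k preceding classes (so order zero is simply the class frequency). The function name `transition_probs`, the `<s>` padding symbol, and the two-sentence corpus are hypothetical, and the abstract does not say how the probabilities were actually estimated or whether smoothing was applied.

```python
from collections import Counter


def transition_probs(tag_sequences, order):
    """Relative-frequency estimate of P(class | previous `order` classes).

    Returns a dict keyed by tuples (context..., class). Sentence starts are
    padded with "<s>" so that the first classes also have a context.
    """
    context_counts = Counter()
    ngram_counts = Counter()
    for tags in tag_sequences:
        padded = ["<s>"] * order + list(tags)
        for i in range(order, len(padded)):
            context = tuple(padded[i - order:i])
            ngram_counts[context + (padded[i],)] += 1
            context_counts[context] += 1
    return {ngram: n / context_counts[ngram[:-1]]
            for ngram, n in ngram_counts.items()}


# Invented two-sentence corpus of class sequences.
corpus = [["DET", "NOUN", "VERB"], ["DET", "NOUN", "NOUN"]]

zero = transition_probs(corpus, 0)     # P(class)
first = transition_probs(corpus, 1)    # P(class | previous class)
second = transition_probs(corpus, 2)   # P(class | previous two classes)

print(zero[("NOUN",)])                 # 0.5
print(first[("NOUN", "VERB")])         # 0.5
print(second[("DET", "NOUN", "VERB")]) # 0.5
```

Unsmoothed estimates like these assign zero probability to any class sequence absent from the training text, which is why a practical system would normally add some form of smoothing.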


Index Terms:
Text recognition, OCR, document recognition, document analysis, syntax, language syntax, HMM, hidden Markov model, character recognition.
Citation:
Jonathan J. Hull, "Incorporating Language Syntax in Visual Text Recognition with a Statistical Model," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 12, pp. 1251-1256, Dec. 1996, doi:10.1109/34.546261