IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 31, no. 10, October 2009
Jerod J. Weinman, Grinnell College, Grinnell
Erik Learned-Miller, University of Massachusetts Amherst, Amherst
Allen R. Hanson, University of Massachusetts Amherst, Amherst
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TPAMI.2009.38
Scene text recognition (STR) is the recognition of text anywhere in the environment, such as signs and storefronts. Relative to document recognition, it is challenging because of font variability, minimal language context, and uncontrolled conditions. Much information available to solve this problem is frequently ignored or used sequentially. Similarity between character images is often overlooked as useful information. Because of language priors, a recognizer may assign different labels to identical characters. Directly comparing characters to each other, rather than only a model, helps ensure that similar instances receive the same label. Lexicons improve recognition accuracy but are used post hoc. We introduce a probabilistic model for STR that integrates similarity, language properties, and lexical decision. Inference is accelerated with sparse belief propagation, a bottom-up method for shortening messages by reducing the dependency between weakly supported hypotheses. By fusing information sources in one model, we eliminate unrecoverable errors that result from sequential processing, improving accuracy. In experimental results recognizing text from images of signs in outdoor scenes, incorporating similarity reduces character recognition error by 19 percent, the lexicon reduces word recognition error by 35 percent, and sparse belief propagation reduces the lexicon words considered by 99.9 percent with a 12X speedup and no loss in accuracy.
Scene text recognition, optical character recognition, conditional random fields, factor graphs, graphical models, lexicon, language model, similarity, belief propagation, sparse belief propagation.
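The abstract describes sparse belief propagation as shortening messages by dropping weakly supported hypotheses. The sketch below illustrates that general idea only and is not the authors' algorithm: it assumes a simple pruning rule (keep the most probable states until they cover at least 1 - eps of the probability mass) and then computes a message to a neighboring variable from the kept states alone. The function names `sparsify_belief` and `sparse_message`, the threshold `eps`, and the toy potentials are all hypothetical.

```python
def sparsify_belief(belief, eps=1e-3):
    """Keep the fewest states whose total probability is at least 1 - eps.

    Assumed pruning rule for illustration; returns (kept_indices, renormalized_probs).
    """
    total = sum(belief)
    probs = [b / total for b in belief]
    # Visit states from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= 1.0 - eps:
            break
    kept.sort()
    # Renormalize over the surviving states only.
    return kept, [probs[i] / mass for i in kept]


def sparse_message(kept, kept_probs, pairwise):
    """Sum-product message to a neighbor, summing over kept states only.

    pairwise[i][j] is the compatibility of sender state i with receiver state j.
    """
    n_out = len(pairwise[0])
    msg = [
        sum(p * pairwise[i][j] for i, p in zip(kept, kept_probs))
        for j in range(n_out)
    ]
    z = sum(msg)
    return [m / z for m in msg]


# Toy example: four sender hypotheses, two receiver states.
belief = [0.90, 0.0995, 0.0004, 0.0001]
pairwise = [[1, 0], [0, 1], [1, 1], [1, 1]]
kept, probs = sparsify_belief(belief, eps=1e-3)
message = sparse_message(kept, probs, pairwise)
```

In this toy case only two of the four hypotheses survive pruning, so the message is computed from two rows of the potential table instead of four. With a large lexicon, the same kind of pruning is what would make the savings substantial.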
Jerod J. Weinman, Erik Learned-Miller, Allen R. Hanson, "Scene Text Recognition Using Similarity and a Lexicon with Sparse Belief Propagation", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 31, no. 10, pp. 1733-1746, October 2009, doi:10.1109/TPAMI.2009.38