Issue No.08 - Aug. (2012 vol.34)
pp: 1469-1481
Fei Yin, Nat. Lab. of Pattern Recognition, Inst. of Autom., Beijing, China
Qiu-Feng Wang, Nat. Lab. of Pattern Recognition, Inst. of Autom., Beijing, China
Cheng-Lin Liu, Nat. Lab. of Pattern Recognition, Inst. of Autom., Beijing, China
ABSTRACT
This paper presents an effective approach for the offline recognition of unconstrained handwritten Chinese texts. Under the general integrated segmentation-and-recognition framework with character oversegmentation, we investigate three important issues: candidate path evaluation, path search, and parameter estimation. For path evaluation, we combine multiple contexts (character recognition scores, geometric and linguistic contexts) from the Bayesian decision view, and convert the classifier outputs to posterior probabilities via confidence transformation. In path search, we use a refined beam search algorithm to improve the search efficiency and a candidate character augmentation strategy to improve the recognition accuracy. The combining weights of the path evaluation function are optimized by supervised learning using a Maximum Character Accuracy criterion. We evaluated the recognition performance on the Chinese handwriting database CASIA-HWDB, which contains nearly four million character samples of 7,356 classes and 5,091 pages of unconstrained handwritten texts. The experimental results show that confidence transformation and combining multiple contexts improve the text line recognition performance significantly. On a test set of 1,015 handwritten pages, the proposed approach achieved a character-level accurate rate of 90.75 percent and a correct rate of 91.39 percent, which are far superior to the best results reported in the literature.
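The path evaluation and path search steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' refined beam search or their actual evaluation function; it is a plain beam search over a toy candidate lattice, assuming each candidate character already carries a log posterior (as confidence transformation would provide) and that a bigram language model is the only additional context. The names `beam_search`, `CANDS`, `toy_bigram`, and `lm_weight` are hypothetical and chosen for illustration only.

```python
import math
from collections import defaultdict

def beam_search(n_segments, candidates, bigram_logp, lm_weight=1.0, beam_width=3):
    """Generic beam search over an oversegmentation lattice (illustrative only).

    candidates: dict mapping (start, end) segment boundaries, start < end,
                to a list of (char, log_posterior) hypotheses.
    bigram_logp: function (prev_char, char) -> log probability.
    Path score = sum of recognition log posteriors
                 + lm_weight * sum of bigram log probabilities.
    """
    by_start = defaultdict(list)
    for (start, end), hyps in candidates.items():
        by_start[start].append((end, hyps))

    # beams[pos] holds partial paths (score, chars) ending at boundary pos.
    beams = defaultdict(list)
    beams[0] = [(0.0, ())]

    for pos in range(n_segments):
        # Keep only the best beam_width partial paths at this boundary.
        beams[pos] = sorted(beams[pos], key=lambda p: p[0], reverse=True)[:beam_width]
        for score, chars in beams[pos]:
            prev = chars[-1] if chars else "<s>"
            for end, hyps in by_start[pos]:
                for ch, rec_logp in hyps:
                    new_score = score + rec_logp + lm_weight * bigram_logp(prev, ch)
                    beams[end].append((new_score, chars + (ch,)))

    final = beams[n_segments]
    if not final:
        return None
    best_score, best_chars = max(final, key=lambda p: p[0])
    return "".join(best_chars), best_score

# Toy lattice over 3 primitive segments; letters stand in for characters.
CANDS = {
    (0, 1): [("A", math.log(0.6)), ("B", math.log(0.4))],
    (1, 2): [("C", math.log(0.9))],
    (2, 3): [("E", math.log(0.9))],
    (1, 3): [("D", math.log(0.5))],
    (0, 2): [("F", math.log(0.3))],
}

def toy_bigram(prev, ch):
    # Hypothetical language model that strongly prefers "D" after "A".
    return math.log(0.9) if (prev, ch) == ("A", "D") else math.log(0.1)

no_lm = beam_search(3, CANDS, toy_bigram, lm_weight=0.0)
with_lm = beam_search(3, CANDS, toy_bigram, lm_weight=1.0)
```

On this toy lattice, recognition scores alone select the segmentation "ACE", while adding the language model term switches the result to "AD". This shows in miniature how the combining weight changes which segmentation-recognition path wins.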
INDEX TERMS
text analysis, Bayes methods, handwritten character recognition, learning (artificial intelligence), natural languages, pattern classification, probability, search problems, character-level correct rate, handwritten Chinese text offline recognition, integrated segmentation-and-recognition framework, character oversegmentation, path search, parameter estimation, multiple contexts, character recognition scores, geometric contexts, linguistic contexts, Bayesian decision, classifier, posterior probabilities, confidence transformation, beam search algorithm, search efficiency improvement, candidate character augmentation strategy, recognition accuracy improvement, supervised learning, path evaluation function optimization, maximum character accuracy criterion, recognition performance, Chinese handwriting database, CASIA-HWDB, unconstrained handwritten texts, text line recognition performance improvement, handwritten pages, character-level accurate rate, Character recognition, Text recognition, Context, Handwriting recognition, Hidden Markov models, Image segmentation, Lattices, maximum character accuracy training, Handwritten Chinese text recognition, geometric models, language models, refined beam search, candidate character augmentation
CITATION
Fei Yin, Qiu-Feng Wang, Cheng-Lin Liu, "Handwritten Chinese Text Recognition by Integrating Multiple Contexts", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.34, no. 8, pp. 1469-1481, Aug. 2012, doi:10.1109/TPAMI.2011.264