Off-Line Handwritten Chinese Character Recognition as a Compound Bayes Decision Problem
September 1998 (vol. 20 no. 9)
pp. 1016-1023

Abstract: An off-line handwritten Chinese character recognizer based on Contextual Vector Quantization (CVQ) of every pixel of an unknown character image has been constructed. Each template character is represented by a codebook. When an unknown image is matched against a template character, each pixel of the image is quantized according to the associated codebook by considering not just the feature vector observed at that pixel, but also the feature vectors observed at its neighbors and their quantizations. Structural information, such as the stroke counts observed at each pixel, is captured to form a cellular feature vector. Supporting a vocabulary of 4,616 simplified Chinese characters plus alphanumeric and punctuation symbols, the writer-independent recognizer has an average recognition rate of 77.2 percent. Three statistical language models for postprocessing have been studied for their effectiveness in improving the recognition rate of the system. Among them, the CVQ-based language model is the most effective, raising the recognition rate by 10.4 percent on average.
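
The abstract does not spell out the quantization rule, so the following is only a rough sketch of what it can mean to quantize a pixel using its neighbors' quantizations as well as its own feature vector. It swaps in an iterated-conditional-modes style update, which is not claimed to be the authors' CVQ procedure. The function name `contextual_quantize`, the smoothing weight `beta`, the 4-neighbor system, and the squared-distance cost are all assumptions made for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def contextual_quantize(
    features: List[List[Vector]],  # features[r][c] is the vector at pixel (r, c)
    codebook: List[Vector],        # hypothetical codebook of one template
    beta: float = 1.0,             # assumed weight of the neighbor term
    max_sweeps: int = 10,
) -> List[List[int]]:
    """Assign a codeword index to every pixel, taking neighbors into account."""
    rows, cols = len(features), len(features[0])
    codes = range(len(codebook))

    # Context-free starting point: nearest codeword for each pixel.
    labels = [
        [min(codes, key=lambda k: sq_dist(features[r][c], codebook[k]))
         for c in range(cols)]
        for r in range(rows)
    ]

    for _ in range(max_sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                # Current quantizations of the 4-connected neighbors.
                neighbors = [
                    labels[nr][nc]
                    for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= nr < rows and 0 <= nc < cols
                ]

                def cost(k: int) -> float:
                    # Distortion plus a penalty per disagreeing neighbor.
                    disagree = sum(1 for n in neighbors if n != k)
                    return sq_dist(features[r][c], codebook[k]) + beta * disagree

                best = min(codes, key=cost)
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed = True
        if not changed:  # stop once a full sweep changes nothing
            break
    return labels
```

With `beta=0.0` this reduces to ordinary nearest-codeword quantization. A positive `beta` lets an ambiguous pixel be pulled toward the codeword its neighbors agree on, which is the kind of contextual effect the abstract describes.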

Index Terms:
Off-line handwritten Chinese character recognition, Chinese language modeling, compound Bayes decision, contextual vector quantization, Chinese word segmentation.
Citation:
Pak-Kwong Wong, Chorkin Chan, "Off-Line Handwritten Chinese Character Recognition as a Compound Bayes Decision Problem," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 9, pp. 1016-1023, Sept. 1998, doi:10.1109/34.713366