Issue No.01 - January/February (2007 vol.22)
Hae-Chang Rim, Korea University
Dongsuk Yook, Korea University
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/MIS.2007.4
Automatic word spacing determines the correct boundaries between words in a sentence. Word spacing is important in Korean, and word spacing errors are frequent. Several proposed probabilistic word-spacing models resolve problems with previous statistical approaches. These models regard automatic word spacing as a classification problem similar to part-of-speech tagging. By generalizing hidden Markov models, the models can consider a broader context and estimate more accurate probabilities. The authors tested these models under a wide range of conditions to compare them with the state of the art and performed a detailed error analysis.
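To illustrate the framing of word spacing as a tagging problem, here is a minimal sketch in Python. It is not the authors' generalized model: it assumes a plain first-order hidden Markov model in which each character gets a tag (1 if a space follows it, 0 otherwise), uses add-one smoothing, and decodes with the Viterbi algorithm. The function names `to_tagged`, `train`, and `space` are hypothetical and chosen only for this example.

```python
import math
from collections import Counter


def to_tagged(sentence):
    """Turn a spaced sentence into (char, tag) pairs; tag 1 means a space follows."""
    pairs = []
    for word in sentence.split():
        for i, ch in enumerate(word):
            pairs.append((ch, 1 if i == len(word) - 1 else 0))
    if pairs:
        pairs[-1] = (pairs[-1][0], 0)  # no space after the final character
    return pairs


def train(sentences):
    """Count tag transitions and character emissions from correctly spaced text."""
    trans, emit, tag_count = Counter(), Counter(), Counter()
    vocab = set()
    for s in sentences:
        prev = 1  # assumption: sentence start behaves like a position after a boundary
        for ch, tag in to_tagged(s):
            trans[(prev, tag)] += 1
            emit[(tag, ch)] += 1
            tag_count[tag] += 1
            vocab.add(ch)
            prev = tag
    return trans, emit, tag_count, len(vocab) + 1  # +1 slot for unseen characters


def space(model, text):
    """Insert spaces into unspaced text using Viterbi decoding."""
    trans, emit, tag_count, v = model
    if not text:
        return ""

    def log_t(p, t):  # add-one smoothed transition log-probability
        total = trans[(p, 0)] + trans[(p, 1)]
        return math.log((trans[(p, t)] + 1) / (total + 2))

    def log_e(t, ch):  # add-one smoothed emission log-probability
        return math.log((emit[(t, ch)] + 1) / (tag_count[t] + v))

    scores = {t: log_t(1, t) + log_e(t, text[0]) for t in (0, 1)}
    back = []
    for ch in text[1:]:
        new, ptr = {}, {}
        for t in (0, 1):
            best = max((0, 1), key=lambda p: scores[p] + log_t(p, t))
            new[t] = scores[best] + log_t(best, t) + log_e(t, ch)
            ptr[t] = best
        scores = new
        back.append(ptr)

    # Follow back-pointers from the best final tag to recover the tag sequence.
    tag = max((0, 1), key=lambda t: scores[t])
    tags = [tag]
    for ptr in reversed(back):
        tag = ptr[tag]
        tags.append(tag)
    tags.reverse()

    out = []
    for i, (ch, t) in enumerate(zip(text, tags)):
        out.append(ch)
        if t == 1 and i < len(text) - 1:
            out.append(" ")
    return "".join(out)
```

A model trained with `train(["ab cd", "ab ef"])` can be applied as `space(model, "abcd")`. Conditioning each tag only on the previous tag and the current character is the narrow context that, according to the abstract, the proposed models widen by generalizing hidden Markov models.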
word spacing, probabilistic models, hidden Markov models, n-gram, machine learning
Hae-Chang Rim, Dongsuk Yook, "Automatic Word Spacing Using Probabilistic Models Based on Character n-grams", IEEE Intelligent Systems, vol. 22, no. 1, pp. 28-35, January/February 2007, doi:10.1109/MIS.2007.4