Asian Language Processing, International Conference on (2010)
Harbin, Heilongjiang, China
Dec. 28, 2010 to Dec. 30, 2010
ISBN: 978-0-7695-4288-1
pp: 39-42
The dictionary mechanism is the basis of Chinese word segmentation, and its quality directly affects segmentation speed and efficiency. Existing dictionary mechanisms suffer from wasted space, low efficiency, and difficult maintenance, so establishing an effective mechanism is an urgent problem for Chinese word segmentation. This paper first studies the idea of the finite-state automaton, then proposes a new dictionary mechanism that saves space and improves the speed of Chinese word segmentation as far as possible, and finally analyzes the performance of various dictionary mechanisms through theoretical study and experimental comparison. The results show that, compared with other mechanisms, the proposed dictionary mechanism based on the finite-state automaton improves both space complexity and time complexity.
finite-state automaton, Chinese word segmentation, dictionary mechanism, complexity
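
The abstract does not give the details of the proposed mechanism, so the following is only a minimal illustrative sketch of the general idea of storing a segmentation dictionary as a deterministic finite-state automaton. It assumes a plain character trie (states as integers, accepting states marking word ends) combined with forward maximum matching; the names `DictionaryAutomaton`, `longest_match`, and `segment` are hypothetical and are not taken from the paper.

```python
class DictionaryAutomaton:
    """Character-level deterministic automaton (a trie); not the paper's exact structure."""

    def __init__(self, words=()):
        # transitions[state] maps a character to the next state; state 0 is the start state.
        self.transitions = [{}]
        # States at which a complete dictionary word ends.
        self.accepting = set()
        for word in words:
            self.add(word)

    def add(self, word):
        """Insert a word, creating new states only where no transition exists yet."""
        state = 0
        for ch in word:
            nxt = self.transitions[state].get(ch)
            if nxt is None:
                nxt = len(self.transitions)
                self.transitions.append({})
                self.transitions[state][ch] = nxt
            state = nxt
        self.accepting.add(state)

    def longest_match(self, text, start):
        """Return the length of the longest dictionary word beginning at text[start], or 0."""
        state = 0
        best = 0
        for i in range(start, len(text)):
            state = self.transitions[state].get(text[i])
            if state is None:
                break
            if state in self.accepting:
                best = i - start + 1
        return best


def segment(text, automaton):
    """Forward maximum matching; characters with no match become one-character tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        length = automaton.longest_match(text, pos) or 1
        tokens.append(text[pos:pos + length])
        pos += length
    return tokens


if __name__ == "__main__":
    automaton = DictionaryAutomaton(["中", "中文", "分词"])
    print(segment("中文分词", automaton))
```

In this sketch, words that share a prefix share states, which is one way an automaton-based dictionary can save space, and each lookup costs time proportional to the length of the matched word rather than to the size of the dictionary. A minimized automaton that also merges common suffixes would be more compact still, but that refinement is beyond this example.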

