International Conference on Semantic Computing (ICSC 2007)
Khmer POS Tagger: A Transformation-based Approach with Hybrid Unknown Word Handling
Irvine, California
September 17-September 19
ISBN: 0-7695-2997-6
Chenda Nou, Waseda University, Japan
Wataru Kameyama, Waseda University, Japan
This paper presents initial research on a Khmer part-of-speech tagger. We propose modifications to the rule-application algorithm of the transformation-based approach to adapt it to the Khmer language, which differs morphologically and syntactically from English. Furthermore, to overcome the limited coverage of the rule-based approach in handling unknown words, we propose a hybrid approach that combines the rule-based and trigram models. Although trained on a very small corpus, both proposed approaches achieve higher accuracy than the conventional methods. The tagger achieves 95.27% accuracy on the training data and 91.96% on test data in which 9% of the words are unknown.
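For readers unfamiliar with transformation-based tagging, the general idea can be sketched as follows. This is a minimal illustration of the generic technique only, not the authors' modified algorithm: the rule format, the function names (`initial_tags`, `apply_rules`, `tag_sentence`, `guess_noun`), the tag labels, and the toy English data are all assumptions made for the example. The `guess_unknown` callback marks the place where a statistical model, such as the trigram model mentioned in the abstract, could supply tags for unknown words.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical rule format: retag `from_tag` as `to_tag`
# when the tag of the previous word is `prev_tag`.
Rule = Tuple[str, str, str]


def initial_tags(words: List[str], lexicon: Dict[str, str],
                 guess_unknown: Callable[[str], str]) -> List[str]:
    """Give each known word its lexicon tag; delegate unknown words to a fallback."""
    return [lexicon[w] if w in lexicon else guess_unknown(w) for w in words]


def apply_rules(tags: List[str], rules: List[Rule]) -> List[str]:
    """Apply each rule in order, scanning the sentence left to right."""
    tags = list(tags)  # work on a copy so the caller's list is untouched
    for from_tag, to_tag, prev_tag in rules:
        for i in range(1, len(tags)):
            if tags[i] == from_tag and tags[i - 1] == prev_tag:
                tags[i] = to_tag
    return tags


def tag_sentence(words: List[str], lexicon: Dict[str, str], rules: List[Rule],
                 guess_unknown: Callable[[str], str]) -> List[str]:
    """Initial tagging followed by the ordered transformation rules."""
    return apply_rules(initial_tags(words, lexicon, guess_unknown), rules)


# Toy English data, assumed purely for illustration (not from the paper).
LEXICON = {"the": "DET", "can": "MD", "rusted": "VBD"}
RULES = [("MD", "NN", "DET")]  # a modal directly after a determiner becomes a noun


def guess_noun(word: str) -> str:
    # Stand-in for a statistical unknown-word guesser (e.g. a trigram model).
    return "NN"


print(tag_sentence(["the", "can", "rusted"], LEXICON, RULES, guess_noun))
```

Keeping the unknown-word guesser as a separate callback reflects the hybrid idea in spirit: the rules correct the initial tags, while a different model handles words the lexicon does not cover.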
Citation:
Chenda Nou, Wataru Kameyama, "Khmer POS Tagger: A Transformation-based Approach with Hybrid Unknown Word Handling," icsc, pp.482-492, International Conference on Semantic Computing (ICSC 2007), 2007