2010 Second International Conference on Knowledge and Systems Engineering
A Semi-supervised Learning Method for Vietnamese Part-of-Speech Tagging
Hanoi, Vietnam
October 07-October 09
ISBN: 978-0-7695-4213-3
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/KSE.2010.35
This paper presents a semi-supervised learning method for Vietnamese part-of-speech tagging. We consider two powerful tagging models, Conditional Random Fields (CRFs) and Guided Online-Learning models (GLs), as base learning models. We then propose a semi-supervised tagging model for both CRFs and GLs. The main idea is to use a word-cluster model as an auxiliary source to enrich the feature space of the discriminative learning models in both the training and decoding processes. Experimental results on the Vietnamese Treebank (VTB) data showed that the proposed method is effective. Our best model achieved an accuracy of 94.10% when tested on VTB, and 92.60% on an independent test.
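The feature-enrichment idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `token_features`, the feature names, and the cluster IDs in `CLUSTERS` are all hypothetical, and the word-to-cluster map is assumed to come from some clustering model trained on unlabeled text. The sketch only shows cluster IDs being added alongside ordinary lexical features, with the same lookup applied at training and decoding time.

```python
from typing import Dict, List

# Hypothetical word-to-cluster map. In practice it would be produced by a
# word-clustering model trained on unlabeled text; these IDs are made up.
CLUSTERS: Dict[str, str] = {
    "tôi": "0110",
    "học": "1011",
    "sinh_viên": "0101",
}


def token_features(words: List[str], i: int,
                   clusters: Dict[str, str]) -> Dict[str, str]:
    """Build a feature dict for the token at position i.

    Lexical features come from the labeled sentence itself; cluster
    features come from the external word-cluster map (an assumption here).
    """
    feats = {"w0": words[i].lower()}
    if i > 0:
        feats["w-1"] = words[i - 1].lower()
    if i + 1 < len(words):
        feats["w+1"] = words[i + 1].lower()

    # Cluster features for the previous, current, and next word.
    # Words missing from the cluster map simply contribute no feature.
    for offset, name in ((-1, "c-1"), (0, "c0"), (1, "c+1")):
        j = i + offset
        if 0 <= j < len(words):
            cluster_id = clusters.get(words[j].lower())
            if cluster_id is not None:
                feats[name] = cluster_id
    return feats
```

Feature dicts of this shape could be passed to any discriminative sequence tagger that accepts per-token features. Because the cluster map is built from unlabeled text, a word unseen in the labeled training data can still share a cluster feature with words that were seen.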
Index Terms:
Part of Speech tagging, Guided Learning, Semi-Supervised Learning, Conditional Random Fields
Citation:
Le Minh Nguyen, Bach Ngo Xuan, Cuong Nguyen Viet, Minh Pham Quang Nhat, Akira Shimazu, "A Semi-supervised Learning Method for Vietnamese Part-of-Speech Tagging," kse, pp.141-146, 2010 Second International Conference on Knowledge and Systems Engineering, 2010