International Conference on Information Technology: Coding and Computing (ITCC'04), Volume 2
An Approach to Improving the Quality of Part-of-Speech Tagging of Chinese Text
Las Vegas, Nevada, April 05-April 07
ISBN: 0-7695-2108-8
Disambiguating multi-category words (words that can take more than one part of speech) is one of the main difficulties in part-of-speech tagging, and it strongly affects the quality of processed corpora. To address this problem, the paper describes an approach to automatically correcting the part-of-speech tags of multi-category words. Using rough set theory and data mining, it acquires correction rules from correctly tagged corpora and then applies these rules to automatically correct the tags of multi-category words in a corpus. According to closed and open tests on a corpus of 500,000 Chinese characters, tagging accuracy can be increased by 11.32% and 5.97%, respectively.
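The two-stage idea in the abstract (learn correction rules from a correctly tagged corpus, then apply them to retag multi-category words) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it replaces the rough-set rule acquisition with a plain frequency and confidence threshold, and the rule condition (the word plus its neighbouring tags), the names `mine_rules` and `apply_rules`, the thresholds, the tag labels, and the toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def _context(sent, i):
    """Return (previous tag, next tag) for position i, with boundary markers."""
    prev_tag = sent[i - 1][1] if i > 0 else "<s>"
    next_tag = sent[i + 1][1] if i + 1 < len(sent) else "</s>"
    return prev_tag, next_tag


def mine_rules(gold, min_support=2, min_confidence=0.9):
    """Learn rules (word, prev_tag, next_tag) -> tag from a correctly tagged corpus.

    gold is a list of sentences; each sentence is a list of (word, tag) pairs.
    The thresholds are hypothetical stand-ins for the paper's rough-set criteria.
    """
    # A word seen with more than one tag is treated as multi-category.
    tags_by_word = defaultdict(set)
    for sent in gold:
        for word, tag in sent:
            tags_by_word[word].add(tag)
    multi = {w for w, tags in tags_by_word.items() if len(tags) > 1}

    # Count which tag each multi-category word takes in each context.
    counts = defaultdict(Counter)
    for sent in gold:
        for i, (word, tag) in enumerate(sent):
            if word in multi:
                counts[(word, *_context(sent, i))][tag] += 1

    # Keep only contexts that are frequent and nearly unambiguous.
    rules = {}
    for condition, tag_counts in counts.items():
        best_tag, best_n = tag_counts.most_common(1)[0]
        total = sum(tag_counts.values())
        if total >= min_support and best_n / total >= min_confidence:
            rules[condition] = best_tag
    return rules


def apply_rules(sent, rules):
    """Retag words whose (word, prev_tag, next_tag) context matches a rule."""
    return [
        (word, rules.get((word, *_context(sent, i)), tag))
        for i, (word, tag) in enumerate(sent)
    ]


if __name__ == "__main__":
    # Toy corpus (invented): the word is a verb between pronouns, a noun before an adjective.
    gold = [
        [("我", "r"), ("建议", "v"), ("他", "r")],
        [("你", "r"), ("建议", "v"), ("她", "r")],
        [("这", "r"), ("建议", "n"), ("好", "a")],
        [("那", "r"), ("建议", "n"), ("好", "a")],
    ]
    rules = mine_rules(gold)
    print(apply_rules([("他", "r"), ("建议", "n"), ("我", "r")], rules))
```

In this sketch a rule fires only when its exact context was seen often enough in the training data, so words in unseen contexts keep the tag assigned by the original tagger.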
Citation:
Yi-li Qian, Jia-heng Zheng, "An Approach to Improving the Quality of Part-of-Speech Tagging of Chinese Text," itcc, vol. 2, pp.183, International Conference on Information Technology: Coding and Computing (ITCC'04) Volume 2, 2004