Fourth International Conference on Computer and Information Technology (CIT'04)
A Multi-Label Chinese Text Categorization System Based on Boosting Algorithm
Wuhan, China
September 14-16, 2004
ISBN: 0-7695-2216-5
Junli Chen, Zhejiang University
Xuezhong Zhou, Zhejiang University
Zhaohui Wu, Zhejiang University
This paper presents a multi-label Chinese text categorization system based on Chinese character features and a boosting algorithm. The system has been successfully evaluated on the TCM-MED dataset, provided by the China Academy of Traditional Chinese Medicine (TCM), and on the Reuters-21578 benchmark. We suggest that the TCM-MED dataset can serve as a standard corpus for Chinese text categorization tasks. We have also carried out experiments comparing the boosting algorithm with two other traditional algorithms on the same datasets. The results indicate that, for the design of a multi-label Chinese text categorization system, the boosting algorithm performs well and outperforms the other two algorithms.
Citation:
Junli Chen, Xuezhong Zhou, Zhaohui Wu, "A Multi-Label Chinese Text Categorization System Based on Boosting Algorithm," Fourth International Conference on Computer and Information Technology (CIT'04), pp. 1153-1158, 2004