2008 International Symposiums on Information Processing
An Incremental Chinese Text Classification Algorithm Based on Quick Clustering
May 23-May 25
ISBN: 978-0-7695-3151-9
Most conventional incremental learning algorithms select only one optimized text sample at a time, which neither takes into account the relationships among texts in the unlabeled text set nor makes incremental learning efficient. In addition, because the classifier has only limited stored information, the selected text is easily misclassified, and learning from a wrongly labeled text reduces incremental learning precision. To overcome these problems, this paper proposes a new incremental learning algorithm based on quick clustering. First, it improves incremental learning efficiency by clustering similar texts in the unlabeled text set. The texts at the centers of the clusters are selected as a representative text set, and incremental learning then chooses texts from this representative set under the 0-1 loss rate. Second, to improve incremental learning precision, a new method for choosing a reasonable learning sequence is proposed, which both strengthens the positive impact of the more mature data on classification and weakens the negative impact of noisy data. The experimental results show that the algorithm increases both classification efficiency and precision.
Index Terms:
Incremental learning, Text classification, Affinity propagation, Bayes, Text clustering
Citation:
Houfeng Ma, Xinghua Fan, Ji Chen, "An Incremental Chinese Text Classification Algorithm Based on Quick Clustering," isip, pp.308-312, 2008 International Symposiums on Information Processing, 2008
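The procedure outlined in the abstract (cluster the unlabeled texts, keep one representative per cluster, then absorb representatives into a Bayes classifier in an order guided by 0-1 loss) can be illustrated with a minimal sketch. It is not the authors' implementation. Several parts are assumptions made for illustration: texts are taken to be already segmented into tokens, a single-pass "leader" clustering with a cosine threshold stands in for affinity propagation, and the selection rule (tentatively add each candidate with its predicted label and keep the one giving the lowest 0-1 loss on the labeled set) is one plausible reading of "under the 0-1 loss rate". All names (NaiveBayes, representatives, incremental_learn, threshold) are hypothetical.

```python
import copy
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing; counts grow incrementally."""

    def __init__(self):
        self.class_docs = Counter()              # documents seen per class
        self.word_counts = defaultdict(Counter)  # class -> token -> count
        self.vocab = set()

    def update(self, tokens, label):
        """Absorb one labeled document into the model's counts."""
        self.class_docs[label] += 1
        self.word_counts[label].update(tokens)
        self.vocab.update(tokens)

    def predict(self, tokens):
        total = sum(self.class_docs.values())
        best, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            score = math.log(n_docs / total)
            score += sum(math.log((self.word_counts[label][t] + 1) / denom)
                         for t in tokens)
            if score > best_score:
                best, best_score = label, score
        return best


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def representatives(docs, threshold=0.5):
    """Assumed stand-in for the clustering step: a text becomes a new cluster
    leader only if it is not similar enough to any existing leader."""
    leaders = []
    for tokens in docs:
        vec = Counter(tokens)
        if all(cosine(vec, Counter(lead)) < threshold for lead in leaders):
            leaders.append(tokens)
    return leaders


def zero_one_loss(model, labeled):
    """Fraction of labeled documents the model misclassifies."""
    return sum(model.predict(x) != y for x, y in labeled) / len(labeled)


def incremental_learn(model, labeled, unlabeled, threshold=0.5):
    """Absorb representatives one at a time, lowest resulting 0-1 loss first."""
    pool = representatives(unlabeled, threshold)
    while pool:
        trials = []
        for i, tokens in enumerate(pool):
            trial = copy.deepcopy(model)
            trial.update(tokens, model.predict(tokens))  # self-assigned label
            trials.append((zero_one_loss(trial, labeled), i, trial))
        _, i, model = min(trials, key=lambda t: t[:2])
        pool.pop(i)
    return model


# Toy usage with pre-tokenized texts (illustrative data only).
labeled = [(["stock", "market", "bank"], "finance"),
           (["match", "team", "goal"], "sports")]
unlabeled = [["bank", "stock", "loan"], ["stock", "bank", "loan"],
             ["team", "goal", "coach"]]

base = NaiveBayes()
for tokens, label in labeled:
    base.update(tokens, label)
learned = incremental_learn(base, labeled, unlabeled)
print(learned.predict(["coach"]))
```

Because only cluster representatives are scored, the number of tentative model updates per round scales with the number of clusters rather than with the size of the unlabeled set, which is the efficiency argument the abstract makes. The deep copy per candidate keeps the sketch simple; a practical version would add and retract counts in place.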