Sixth International Conference on Parallel and Distributed Computing Applications and Technologies (PDCAT'05)
A Novel Text Classification Algorithm Based on Naïve Bayes and KL-Divergence
Dalian, China
December 5-8, 2005
ISBN: 0-7695-2405-2
Baoyi WANG, North China Electric Power University, China
Shaomin ZHANG, Xidian University, Xi'an, China
The Naïve Bayes classifier is a popular machine learning method for text classification because it is fast, easy to implement, and performs well. Its strong assumption that each feature word in a document is independent of the other feature words makes this efficiency possible, but it also degrades the quality of the results, because some feature words are interrelated. To improve text classification performance, this paper proposes solutions to some of the problems with Naïve Bayes classifiers. Starting from the original Naïve Bayes algorithm, we introduce feature weight as a factor and combine it with the KL-divergence (relative entropy) between words to improve the classifier. The improved Naïve Bayes classification algorithm is called INBA. Theoretical analysis and experiments show that INBA retains the advantages of the Naïve Bayes classifier while achieving higher classification accuracy, and that the proposed solutions are feasible, practical, and effective.
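The abstract does not give INBA's formulas, so the following is only a minimal sketch of the general idea of feature-weighted Naïve Bayes, not the authors' algorithm. It makes one assumption that is not in the source: each word's weight is the KL-divergence between the class posterior given that word, P(c | w), and the class prior, P(c), and that weight scales the word's log-likelihood term. The function names `train` and `classify` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Fit a weighted multinomial Naive Bayes model.

    docs: list of (tokens, label) pairs.
    Returns (priors, likelihood, weights).
    """
    class_docs = Counter()
    word_counts = defaultdict(Counter)  # label -> word -> count
    vocab = set()
    for tokens, label in docs:
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)

    n_docs = sum(class_docs.values())
    priors = {c: class_docs[c] / n_docs for c in class_docs}

    # Laplace-smoothed P(word | class), so every probability is > 0.
    likelihood = {}
    for c in class_docs:
        total = sum(word_counts[c].values()) + len(vocab)
        likelihood[c] = {w: (word_counts[c][w] + 1) / total for w in vocab}

    # ASSUMPTION (not from the paper): weight(w) = KL(P(c | w) || P(c)).
    # Words that shift the class distribution away from the prior get
    # larger weights; uninformative words get weights near zero.
    weights = {}
    for w in vocab:
        joint = {c: likelihood[c][w] * priors[c] for c in priors}
        z = sum(joint.values())
        weights[w] = sum(
            (p / z) * math.log((p / z) / priors[c]) for c, p in joint.items()
        )
    return priors, likelihood, weights


def classify(model, tokens):
    """Return the class with the highest weighted log-score."""
    priors, likelihood, weights = model
    scores = {}
    for c in priors:
        score = math.log(priors[c])
        for w in tokens:
            if w in weights:  # skip words unseen in training
                score += weights[w] * math.log(likelihood[c][w])
        scores[c] = score
    return max(scores, key=scores.get)


# Toy training data, invented purely for illustration.
docs = [
    (["ball", "goal", "team"], "sports"),
    (["team", "win", "goal"], "sports"),
    (["code", "bug", "compile"], "tech"),
    (["code", "test", "bug"], "tech"),
]
model = train(docs)
label_a = classify(model, ["goal", "team"])
label_b = classify(model, ["bug", "code"])
```

Setting every weight to 1 recovers the ordinary multinomial Naïve Bayes score, so in this sketch the weighting only changes how much each word contributes, not the independence assumption itself.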
Citation:
Baoyi WANG, Shaomin ZHANG, "A Novel Text Classification Algorithm Based on Naïve Bayes and KL-Divergence," pdcat, pp.913-915, Sixth International Conference on Parallel and Distributed Computing Applications and Technologies (PDCAT'05), 2005