Eighth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing (SNPD 2007)
An Effective Method To Improve kNN Text Classifier
Haier International Training Center, Qingdao, China
July 30-August 01
ISBN: 0-7695-2909-7
Xiulan Hao, Fudan University, China
Xiaopeng Tao, Fudan University, China
Chenghong Zhang, Fudan University, China
Yunfa Hu, Fudan University, China
Many standard classification algorithms assume that training examples are evenly distributed among classes. In practice, however, unbalanced data sets appear in many applications. kNN is widely used as a simple and effective categorization method, but it too suffers on biased data sets. While developing the Prototype of Internet Information Security for the Shanghai Council of Information and Security, we observed that when the training data set is biased, almost all test documents from some rare categories are classified into common ones. To alleviate this problem, we propose a novel concept, the critical point (CP), and adapt traditional kNN by integrating CP's approximate value (LB or UB) and the number of training examples into the decision rules. Extensive experiments show that the adapted kNN achieves a significant improvement in classification performance on biased corpora.
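The bias described in the abstract can be illustrated with a minimal sketch. The toy 2-D points, the function names, and the class-size weighting below are assumptions made for illustration only. In particular, the weighting is a generic remedy for imbalance and is not the critical point (CP) method of the paper, whose definition the abstract does not give.

```python
from collections import Counter
import math


def k_nearest(train, query, k):
    """Return the labels of the k training points closest to query."""
    ranked = sorted(train, key=lambda item: math.dist(item[0], query))
    return [label for _, label in ranked[:k]]


def knn_majority(train, query, k):
    """Traditional kNN decision rule: the most frequent neighbor label wins."""
    votes = Counter(k_nearest(train, query, k))
    return max(votes, key=votes.get)


def knn_size_weighted(train, query, k):
    """Hypothetical variant: divide each class's votes by its training count."""
    class_sizes = Counter(label for _, label in train)
    votes = Counter(k_nearest(train, query, k))
    scores = {label: n / class_sizes[label] for label, n in votes.items()}
    return max(scores, key=scores.get)


# Biased toy training set: 8 "common" examples and only 2 "rare" ones.
train = [((x, 0.0), "common") for x in (1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5)]
train += [((0.0, 0.0), "rare"), ((0.2, 0.0), "rare")]

# The query lies between the two rare examples, far from every common one.
query = (0.1, 0.0)

print(knn_majority(train, query, k=5))       # 3 common votes outnumber 2 rare
print(knn_size_weighted(train, query, k=5))  # rare: 2/2, common: 3/8
```

With k = 5 the neighborhood contains both rare examples and the three nearest common ones, so the plain majority vote assigns the query to the common class even though its two closest neighbors are rare. Normalizing by class size reverses the decision in this toy case.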
Citation:
Xiulan Hao, Xiaopeng Tao, Chenghong Zhang, Yunfa Hu, "An Effective Method To Improve kNN Text Classifier," Eighth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing (SNPD 2007), vol. 1, pp. 379-384, 2007