Class Noise Handling for Effective Cost-Sensitive Learning by Cost-Guided Iterative Classification Filtering
Issue No. 10 - October (2006 vol. 18)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2006.155
Recent research in machine learning, data mining, and related areas has produced a wide variety of algorithms for cost-sensitive (CS) classification, where the objective is to minimize the misclassification cost rather than to maximize the classification accuracy. These methods often assume that their input is high-quality data free of conflicting or erroneous values, or that the impact of noise is trivial, which is seldom the case in real-world environments. In this paper, we propose a Cost-guided Iterative Classification Filter (CICF) to identify noise for effective CS learning. Unlike existing efforts, which put equal weight on handling noise in all classes, CICF puts more emphasis on expensive classes, which makes it attractive for data sets with a large cost ratio. Experimental results and comparative studies indicate that noise may seriously degrade the performance of the underlying CS learners and that, by adopting the proposed CICF algorithm, we can significantly reduce the misclassification cost of a CS classifier in noisy environments.
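To make the difference between the two objectives concrete, the following is a minimal sketch, not the CICF algorithm itself. The class labels, the cost values, and the helper names `total_cost` and `accuracy` are all assumptions chosen for illustration; none of them come from the paper. It compares two sets of predictions under an asymmetric cost matrix, where the more accurate one turns out to be the more expensive one.

```python
def total_cost(y_true, y_pred, cost):
    """Sum cost[(actual, predicted)] over all instances.

    Pairs missing from the cost table (correct predictions here) cost 0.
    """
    return sum(cost.get((t, p), 0) for t, p in zip(y_true, y_pred))


def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label matches the actual label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical cost matrix (an assumption): missing a "pos" instance
# costs 10, while a false alarm on a "neg" instance costs 1.
cost = {("pos", "neg"): 10, ("neg", "pos"): 1}

y_true = ["pos", "neg", "neg", "neg", "pos"]
pred_a = ["neg", "neg", "neg", "neg", "pos"]  # one missed positive
pred_b = ["pos", "pos", "pos", "neg", "pos"]  # two false alarms

print(accuracy(y_true, pred_a), total_cost(y_true, pred_a, cost))  # 0.8 10
print(accuracy(y_true, pred_b), total_cost(y_true, pred_b, cost))  # 0.6 2
```

Under these assumed costs, the second set of predictions is less accurate but five times cheaper, which suggests why errors involving the expensive class, including mislabeled training instances of that class, can matter disproportionately in CS learning.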
Data mining, classification, cost-sensitive learning, noise handling.
Xindong Wu, Xingquan Zhu, "Class Noise Handling for Effective Cost-Sensitive Learning by Cost-Guided Iterative Classification Filtering", IEEE Transactions on Knowledge & Data Engineering, vol. 18, no. 10, pp. 1435-1440, October 2006, doi:10.1109/TKDE.2006.155