Issue No. 01 - January/February (2007 vol. 22)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/MIS.2007.7
Junfeng Pan, Hong Kong University of Science and Technology
Qiang Yang, Hong Kong University of Science and Technology
Yiming Yang, Sun Yat-sen University
Lei Li, Sun Yat-sen University
Frances Tianyi Li, Guangzhou E-DM Tech Corporation
George Wenmin Li, Guangzhou E-DM Tech Corporation
Telecommunications companies and financial institutions are facing increasing competition. A staged preprocessing framework for cost-sensitive-data processing can help these companies identify customers who might switch to a competitor (or churn). The framework gives users an intuitive idea of the data distribution using a self-organizing map and then uses a cost matrix to help convert the data with an improved equidepth discretization method. The preprocessed data set can be input to any classifier. When tested on the KDD Cup 1998 data set, the framework performed better than the competition's winner. It has also been implemented in a software product called ED-Money and applied to a Chinese mobile telecommunication data set.
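To make two of the ideas named in the abstract concrete, here is a minimal Python sketch of plain equidepth (equal-frequency) discretization and of choosing an action with a cost matrix. It is not the improved discretization method the article describes, whose details the abstract does not give. The function names, the example cost values, and the two-action setup (0 = do nothing, 1 = contact the customer) are assumptions made only for illustration.

```python
from bisect import bisect_right


def equidepth_cut_points(values, n_bins):
    """Return n_bins - 1 cut points so each bin holds roughly the same number of values.

    Plain equal-frequency binning (illustrative; not the article's improved variant).
    Heavily tied data can produce duplicate cut points, which leaves some bins empty.
    """
    ordered = sorted(values)
    n = len(ordered)
    # The k-th cut point is the first value that belongs to bin k.
    return [ordered[k * n // n_bins] for k in range(1, n_bins)]


def discretize(values, cuts):
    """Map each value to its bin index in the range 0 .. len(cuts)."""
    return [bisect_right(cuts, v) for v in values]


def min_cost_action(p_churn, cost_matrix):
    """Pick the action with the lowest expected cost.

    cost_matrix[a][c] is the (assumed) cost of action a when the true class is c,
    with c = 0 for a customer who stays and c = 1 for a customer who churns.
    """
    probs = [1.0 - p_churn, p_churn]
    expected = [sum(c * p for c, p in zip(row, probs)) for row in cost_matrix]
    return min(range(len(expected)), key=expected.__getitem__)


# Hypothetical usage: eight usage values split into four equal-count bins.
usage = [1, 2, 3, 4, 5, 6, 7, 8]
cuts = equidepth_cut_points(usage, 4)
bins = discretize(usage, cuts)

# Hypothetical costs: missing a churner costs 10, contacting a customer costs 1.
costs = [[0, 10], [1, 0]]
action = min_cost_action(0.2, costs)
```

With these assumed costs, contacting a customer becomes the cheaper action once the estimated churn probability exceeds 1/11, which is how a cost matrix can shift decisions away from the plain majority class.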
data mining, cost-sensitive-data preprocessing, ensemble of classifiers
L. Li, Q. Yang, Y. Yang, G. W. Li, J. Pan and F. T. Li, "Cost-Sensitive-Data Preprocessing for Mining Customer Relationship Management Databases," in IEEE Intelligent Systems, vol. 22, no. 1, pp. 46-51, 2007.