An Instance-Weighting Method to Induce Cost-Sensitive Trees
May/June 2002 (vol. 14 no. 3)
pp. 659-665

Abstract: We introduce an instance-weighting method to induce cost-sensitive trees. It is a generalization of the standard tree induction process in which only the initial instance weights determine the type of tree to be induced: minimum-error trees or minimum high-cost-error trees. We demonstrate that it can easily be adapted to an existing tree learning algorithm. Previous research provides insufficient evidence that the greedy divide-and-conquer algorithm can effectively induce a truly cost-sensitive tree directly from the training data; we provide this empirical evidence in this paper. The algorithm incorporating the instance-weighting method is found to be better than the original algorithm in terms of total misclassification cost, number of high-cost errors, and tree size on two-class data sets. The instance-weighting method is simpler to implement and more effective than a previous method based on altered priors.
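The abstract describes the method only at a high level. As a concrete illustration, the sketch below assigns each training instance a weight proportional to the cost of misclassifying its class and normalizes the weights so they sum to the number of instances N; with uniform costs this reduces to standard minimum-error induction, consistent with the abstract's claim that only the initial instance weights change. This is a minimal sketch in Python, not the paper's implementation: the helper instance_weights, the cost table class_cost, and the use of scikit-learn's DecisionTreeClassifier with sample_weight are assumptions introduced here for illustration.

    # Minimal sketch (an assumption, not the paper's algorithm): turn a
    # class-dependent misclassification cost into per-instance weights
    # and feed them to an off-the-shelf weighted tree learner.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def instance_weights(y, class_cost):
        # Hypothetical helper: weight each instance by the cost of
        # misclassifying its class, normalized so the weights sum to N.
        y = np.asarray(y)
        costs = np.array([class_cost[label] for label in y], dtype=float)
        return costs * (len(y) / costs.sum())

    # Toy two-class data; errors on class 1 are five times as costly.
    X = np.array([[0.0], [0.2], [0.4], [0.6], [0.8], [1.0]])
    y = np.array([0, 0, 0, 1, 1, 0])
    w = instance_weights(y, {0: 1.0, 1: 5.0})

    tree = DecisionTreeClassifier(max_depth=2)
    tree.fit(X, y, sample_weight=w)  # weighted splits favor the costly class
    print(tree.predict(X))

Because the weights enter only through the learner's weighted impurity and leaf statistics, the tree-growing procedure itself is unchanged, which is what makes an approach of this kind easy to graft onto an existing algorithm.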

Index Terms:
Cost-sensitive, decision trees, induction, greedy divide-and-conquer algorithm, instance weighting
Citation:
K.M. Ting, "An Instance-Weighting Method to Induce Cost-Sensitive Trees," IEEE Transactions on Knowledge and Data Engineering, vol. 14, no. 3, pp. 659-665, May-June 2002, doi:10.1109/TKDE.2002.1000348