"Missing Is Useful': Missing Values in Cost-Sensitive Decision Trees
December 2005 (vol. 17 no. 12)
pp. 1689-1693
Many real-world data sets for machine learning and data mining contain missing values, and much previous research regards them as a problem, attempting to impute the missing values before training and testing. In this paper, we study this issue in cost-sensitive learning, which considers both test costs and misclassification costs. If some attributes (tests) are too expensive to obtain values for, it can be more cost-effective to leave those values missing, much as a physician skips expensive and risky tests (missing values) in patient diagnosis (classification). That is, "missing is useful": leaving values missing can actually reduce the total cost of tests and misclassifications, so it is not meaningful to impute them. We discuss and compare several strategies that utilize only known values and that exploit "missing is useful" for cost reduction in cost-sensitive decision tree learning.
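The trade-off described above can be made concrete with a small sketch. The Python fragment below is illustrative only, not the authors' algorithm: it assumes a known test cost per attribute, a misclassification cost matrix, and estimated class distributions before and after observing the test outcome; all names (worth_testing, cost_matrix, and so on) are hypothetical. A test pays for itself only when its expected reduction in misclassification cost exceeds its test cost; otherwise leaving the value missing is cheaper, which is the sense in which "missing is useful."

def expected_misclassification_cost(class_probs, cost_matrix):
    """Expected cost of the cheapest prediction under class_probs.

    cost_matrix[i][j] is the cost of predicting class i when the true
    class is j (diagonal entries are typically 0).
    """
    return min(
        sum(p * row[j] for j, p in enumerate(class_probs))
        for row in cost_matrix
    )

def worth_testing(class_probs, outcome_probs, outcome_class_probs,
                  cost_matrix, test_cost):
    """Decide whether obtaining an attribute value pays for itself.

    outcome_probs[v]       -- probability the test yields value v
    outcome_class_probs[v] -- class distribution after observing value v
    """
    # Cost of predicting now, with the attribute value left missing.
    cost_without = expected_misclassification_cost(class_probs, cost_matrix)
    # Cost of paying for the test, then predicting optimally per outcome.
    cost_with = test_cost + sum(
        pv * expected_misclassification_cost(outcome_class_probs[v], cost_matrix)
        for v, pv in enumerate(outcome_probs)
    )
    return cost_with < cost_without  # False means "missing is useful" here

For example, with asymmetric costs (predicting class 0 for a true class-1 case costs 200, the reverse costs 60) and a test cost of 50:

cost_matrix = [[0, 200], [60, 0]]
print(worth_testing(class_probs=[0.7, 0.3],
                    outcome_probs=[0.5, 0.5],
                    outcome_class_probs=[[0.9, 0.1], [0.5, 0.5]],
                    cost_matrix=cost_matrix,
                    test_cost=50.0))   # False: skipping the test is cheaper

Here predicting immediately costs 42 in expectation, while testing costs 50 + 25 = 75, so the value is better left missing.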

Index Terms:
Induction, knowledge acquisition, machine learning.
Citation:
Shichao Zhang, Zhenxing Qin, Charles X. Ling, Shengli Sheng, "'Missing Is Useful': Missing Values in Cost-Sensitive Decision Trees," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 12, pp. 1689-1693, Dec. 2005, doi:10.1109/TKDE.2005.188