Third International Conference on Information Technology and Applications (ICITA'05) Volume 1
Multivariate Interdependent Discretization for Continuous Attribute
Sydney, Australia
July 04-July 07
ISBN: 0-7695-2316-1
Decision trees are among the most widely used and practical methods in data mining and machine learning. However, many discretization algorithms developed in this field are univariate only, which is inadequate for handling the critical problems characteristic of the medical domain. In this paper, we propose a new multivariate discretization method called Multivariate Interdependent Discretization for Continuous Attributes (MIDCA). The algorithm minimizes the uncertainty between the interdependent attribute and the continuous-valued attribute while maximizing their correlation. The empirical results compare the performance of various decision tree algorithms on twelve real-life datasets from the UCI repository.
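As a rough illustration of discretizing one continuous attribute with respect to a second, interdependent attribute, the Python sketch below picks the single binary cut point that minimizes the weighted entropy of a discrete partner attribute. It is not the MIDCA algorithm itself, whose details are not given in this abstract; the function names `entropy` and `best_cut`, the restriction to one binary cut, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def best_cut(values, partner):
    """Return (cut_point, weighted_entropy) for the binary cut on `values`
    that leaves the least uncertainty about `partner`.

    Hypothetical helper: `partner` stands in for the discrete
    interdependent attribute; this is a generic entropy-based cut,
    not the authors' procedure.
    """
    pairs = sorted(zip(values, partner))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no boundary between identical values
        left = [p for _, p in pairs[:i]]
        right = [p for _, p in pairs[i:]]
        # Entropy of the partner attribute, weighted by interval size
        h = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
        if h < best[1]:
            best = ((lo + hi) / 2, h)
    return best


# Toy data (made up): the partner attribute changes between 3.0 and 10.0
cut, h = best_cut([1.0, 2.0, 3.0, 10.0, 11.0, 12.0],
                  ["a", "a", "a", "b", "b", "b"])
print(cut)
```

On this toy input the chosen cut falls midway between 3.0 and 10.0, where both resulting intervals are pure with respect to the partner attribute. A multivariate method would additionally have to decide which attribute serves as the partner for each continuous attribute, which this sketch leaves out.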
Index Terms:
Multivariate Discretization, Interdependent, Correlated Attribute, Data Mining, Machine Learning
Citation:
Sam Chao, Yiping Li, "Multivariate Interdependent Discretization for Continuous Attribute," Third International Conference on Information Technology and Applications (ICITA'05), vol. 1, pp. 167-172, 2005