Class-Dependent Discretization for Inductive Learning from Continuous and Mixed-Mode Data
July 1995 (vol. 17 no. 7)
pp. 641-651

Abstract—Inductive learning systems can be effectively used to acquire classification knowledge from examples. Many existing symbolic learning algorithms can be applied in domains with continuous attributes when integrated with a discretization algorithm that transforms the continuous attributes into ordered discrete ones. In this paper, a new information-theoretic discretization method optimized for supervised learning is proposed and described. This approach seeks to maximize the mutual dependence, as measured by the interdependence redundancy, between the discrete intervals and the class labels, and can automatically determine the most suitable number of intervals for an inductive learning application. The method has been tested on a number of inductive learning problems, showing that the class-dependent discretizer can significantly improve the classification performance of many existing learning algorithms in domains containing numeric attributes.
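The core idea the abstract describes, choosing a discretization whose intervals are maximally dependent on the class labels, can be sketched in a few lines. The sketch below is an illustration, not the authors' published algorithm: it scores candidate equal-frequency binnings by the interdependence redundancy R = I(X;C) / H(X,C) (the mutual information between interval ids and class labels, normalized by their joint entropy) and keeps the interval count that maximizes it. The function names and the equal-frequency candidate generation are assumptions for the example.

```python
import math
from collections import Counter

def interdependence_redundancy(intervals, labels):
    """R = I(X;C) / H(X,C): mutual information between interval ids and
    class labels, normalized by their joint entropy (0 <= R <= 1)."""
    n = len(labels)
    joint = Counter(zip(intervals, labels))
    px = Counter(intervals)
    py = Counter(labels)
    mi = 0.0   # mutual information I(X;C)
    hxy = 0.0  # joint entropy H(X,C)
    for (x, y), c in joint.items():
        p = c / n
        # p(x,y) / (p(x) * p(y)) == c * n / (px[x] * py[y])
        mi += p * math.log2(c * n / (px[x] * py[y]))
        hxy -= p * math.log2(p)
    return mi / hxy if hxy > 0 else 0.0

def discretize(values, labels, max_intervals=8):
    """Greedy sketch (an assumption, not the paper's exact search):
    try equal-frequency binnings with k = 2..max_intervals and keep
    the k that maximizes interdependence redundancy."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    best_k, best_r = 2, -1.0
    for k in range(2, max_intervals + 1):
        size = n / k
        interval_of = [0] * n
        for rank, i in enumerate(order):
            interval_of[i] = min(int(rank / size), k - 1)
        r = interdependence_redundancy(interval_of, labels)
        if r > best_r:
            best_k, best_r = k, r
    return best_k, best_r
```

Normalizing by the joint entropy is what lets the criterion compare binnings with different numbers of intervals: raw mutual information only grows as intervals are split, while R penalizes the extra uncertainty that finer partitions introduce, so the search can settle on a preferred interval count automatically.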

[1] I. Bratko and I. Kononenko, “Learning diagnostic rules from incomplete and noisy data,” Interactions in Artificial Intelligence and Statistical Methods, B. Phelps, ed., Hants: Technical Press, 1987.
[2] L. Breiman, J.H. Friedman, R.A. Olshen, and C.J. Stone, Classification and Regression Trees, Belmont, Calif.: Wadsworth, 1984.
[3] T. Caelli and A. Pennington, “An improved rule generation method for evidence-based classification systems,” Pattern Recognition, vol. 26, no. 5, pp. 733-740, 1993.
[4] J. Catlett, “On changing continuous attributes into ordered discrete attributes,” Proc. European Working Session on Learning, 1991.
[5] K.C.C. Chan, J.Y. Ching, and A.K.C. Wong, “A probabilistic inductive learning approach to the acquisition of knowledge in medical expert systems,” Proc. Fifth IEEE Computer-Based Medical Systems Symp., Durham, N.C., 1992.
[6] K.C.C. Chan, J.Y. Ching, and A.K.C. Wong, “Learning system fault diagnostic rules: a probabilistic inference approach,” Proc. Conf. Artificial Intelligence Applications in Engineering ’92, Waterloo, Canada, 1992.
[7] K.C.C. Chan and A.K.C. Wong, “APACS: a system for automated pattern analysis and classification,” Computational Intelligence, vol. 6, 1990.
[8] K.C.C. Chan and A.K.C. Wong, “A statistical technique for extracting classificatory knowledge from databases,” Knowledge Discovery in Databases, pp. 107-123, 1991.
[9] J.Y. Ching, “Class-dependent discretization of continuous attributes for inductive learning,” MASc thesis, University of Waterloo, Canada, 1992.
[10] P. Clark and T. Niblett, “Induction in noisy domains,” Progress in Machine Learning: Proc. EWSL 87, I. Bratko and N. Lavrac, eds., Bled, Yugoslavia, 1987.
[11] R. Duda, P. Hart, and D. Stork, Pattern Classification, New York: John Wiley & Sons, 2001.
[12] U.M. Fayyad and K.B. Irani, “On the handling of continuous-valued attributes in decision tree generation,” Machine Learning, vol. 8, pp. 87-102, 1992.
[13] J.A. Hartigan, Clustering Algorithms, New York: John Wiley & Sons, 1975.
[14] R.S. Michalski, “Pattern recognition as rule-guided inductive inference,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 2, no. 4, pp. 349-361, 1980.
[15] R.S. Michalski, “A theory and methodology of inductive learning,” Machine Learning: An Artificial Intelligence Approach, vol. 1, R. Michalski, J. Carbonell, and T. Mitchell, eds., Los Altos, Calif., 1983.
[16] R.S. Michalski, I. Mozetic, J. Hong, and N. Lavrac, “The AQ15 inductive learning system: an overview and experiments,” Report UIUCDCS-R-86-1260, Computer Science Dept., University of Illinois at Urbana-Champaign, 1986.
[17] P.M. Murphy and D.W. Aha, UCI Repository of Machine Learning Databases [machine-readable data repository], Irvine, Calif.: University of California, Dept. of Information and Computer Science, 1991.
[18] C. Yang and S. Hasegawa, “FITNESS: Failure Immunization Technology for Network Services Survivability,” Proc. IEEE GLOBECOM, pp. 1549-1554, 1988.
[19] J.R. Quinlan, “Induction of decision trees,” Machine Learning, vol. 1, pp. 81-106, 1986.
[20] J.R. Quinlan, “Simplifying decision trees,” Int’l J. Man-Machine Studies, vol. 27, pp. 221-234, 1987.
[21] J.R. Quinlan, P.J. Compton, K.A. Horn, and L. Lazarus, “Inductive knowledge acquisition: a case study,” Applications of Expert Systems, J.R. Quinlan, ed., pp. 157-173, Sydney, Australia: Addison-Wesley, 1987.
[22] G.M. Reaven and R.G. Miller, “An attempt to define the nature of chemical diabetes using a multidimensional analysis,” Diabetologia, vol. 16, pp. 17-24, 1979.
[23] D.W. Stashuk and R.K. Naphan, “Probabilistic inference based classification applied to myoelectric signal decomposition,” IEEE Trans. Biomedical Engineering, June 1992.
[24] A.K.C. Wong and D.K.Y. Chiu, “An event-covering method for effective probabilistic inference,” Pattern Recognition, vol. 20, no. 2, pp. 245-255, 1987.
[25] A.K.C. Wong and D.K.Y. Chiu, “Synthesizing statistical knowledge from incomplete mixed-mode data,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 9, no. 6, pp. 796-805, 1987.
[26] A.K.C. Wong and T.S. Liu, “Typicality, diversity and feature pattern of an ensemble,” IEEE Trans. Computers, vol. 24, pp. 158-181, 1975.

Index Terms:
Inductive learning, classification, discretization, continuous attributes, mixed-mode attributes, maximum entropy, mutual information, uncertainty.
John Y. Ching, Andrew K. C. Wong, Keith C. C. Chan, "Class-Dependent Discretization for Inductive Learning from Continuous and Mixed-Mode Data," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, no. 7, pp. 641-651, July 1995, doi:10.1109/34.391407