Globally Optimal Fuzzy Decision Trees for Classification and Regression
December 1999 (vol. 21 no. 12)
pp. 1297-1311

Abstract: A fuzzy decision tree is constructed by allowing a point to have partial membership in the nodes that make up the tree structure. This extension of its expressive capabilities transforms the decision tree into a powerful functional approximant that incorporates features of connectionist methods while remaining easily interpretable. Fuzzification is achieved by superimposing a fuzzy structure over the skeleton of a CART decision tree. A training rule for fuzzy trees, similar to backpropagation in neural networks, is designed. This rule corresponds to a global optimization algorithm that fixes the parameters of the fuzzy splits. The method developed for the automatic generation of fuzzy decision trees is applied to both classification and regression problems. In regression problems, the continuity constraint imposed by the function representation of the fuzzy tree leads to substantial improvements in the quality of the regression and limits the tendency to overfit. In classification, fuzzification provides a means of uncovering the structure of the probability distribution for the classification errors in attribute space. This allows the identification of regions for which the error rate of the tree is significantly lower than the average error rate, sometimes even below the Bayes misclassification rate.
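The idea of partial membership and a gradient-based training rule can be illustrated with a minimal sketch of a one-split fuzzy tree for regression. This is not the authors' implementation: the sigmoid membership function, the `width` softness parameter, the squared-error loss, and all function names are assumptions chosen for illustration, since the abstract does not specify them.

```python
import math


def sigmoid(z):
    """Logistic function, used here as an assumed fuzzy membership shape."""
    return 1.0 / (1.0 + math.exp(-z))


def crisp_predict(x, threshold, left_value, right_value):
    """Ordinary (CART-style) split: the point belongs to exactly one leaf."""
    return right_value if x > threshold else left_value


def fuzzy_predict(x, threshold, width, left_value, right_value):
    """Fuzzy split: the point belongs partially to both leaves.

    The output is the membership-weighted average of the leaf values,
    so it varies continuously with x (width > 0 controls the softness).
    """
    mu_right = sigmoid((x - threshold) / width)
    mu_left = 1.0 - mu_right
    return mu_left * left_value + mu_right * right_value


def squared_error(x, y, threshold, width, left_value, right_value):
    """Assumed training loss for a single example (x, y)."""
    pred = fuzzy_predict(x, threshold, width, left_value, right_value)
    return 0.5 * (pred - y) ** 2


def grad_threshold(x, y, threshold, width, left_value, right_value):
    """Derivative of the squared error with respect to the split threshold.

    Obtained with the chain rule, in the spirit of backpropagation:
    d(loss)/d(pred) * d(pred)/d(mu_right) * d(mu_right)/d(threshold).
    """
    mu = sigmoid((x - threshold) / width)
    pred = (1.0 - mu) * left_value + mu * right_value
    dpred_dthreshold = (right_value - left_value) * mu * (1.0 - mu) * (-1.0 / width)
    return (pred - y) * dpred_dthreshold
```

Because the fuzzy prediction is differentiable in the split parameters, all splits of a tree could in principle be adjusted jointly by gradient descent, whereas the crisp prediction jumps at the threshold. In a deeper tree, one common choice (again an assumption here) is to take a leaf's membership as the product of the memberships along its path from the root.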

Index Terms:
Automatic learning, decision trees, fuzzy set theory, global optimization, backpropagation, nonparametric regression, classification.
Citation:
Alberto Suárez, James F. Lutsko, "Globally Optimal Fuzzy Decision Trees for Classification and Regression," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, no. 12, pp. 1297-1311, Dec. 1999, doi:10.1109/34.817409