Issue No. 2, February 2009 (vol. 31), pp. 245-259
Gonzalo Martínez-Muñoz , Universidad Autónoma de Madrid, Cantoblanco
Daniel Hernández-Lobato , Universidad Autónoma de Madrid, Cantoblanco
Alberto Suárez , Escuela Politécnica Superior, Madrid
Several pruning strategies that can be used to reduce the size and increase the accuracy of bagging ensembles are analyzed. These heuristics select subsets of complementary classifiers that, when combined, can perform better than the whole ensemble. The pruning methods investigated are based on modifying the order of aggregation of classifiers in the ensemble. In the original bagging algorithm, the order of aggregation is left unspecified. When this order is random, the generalization error typically decreases as the number of classifiers in the ensemble increases. If an appropriate ordering for the aggregation process is devised, the generalization error reaches a minimum at intermediate numbers of classifiers. This minimum lies below the asymptotic error of bagging. Pruned ensembles are obtained by retaining a fraction of the classifiers in the ordered ensemble. The performance of these pruned ensembles is evaluated in several benchmark classification tasks under different training conditions. The results of this empirical investigation show that ordered aggregation can be used for the efficient generation of pruned ensembles that are competitive, in terms of performance and robustness of classification, with computationally more costly methods that directly select optimal or near-optimal subensembles.
Index Terms: Ensembles of classifiers, bagging, decision trees, ensemble selection, ensemble pruning, ordered aggregation.
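The ordered-aggregation idea described in the abstract can be illustrated with a short sketch: train a bagging ensemble, greedily reorder its members so that each newly appended classifier minimizes the error of the majority vote built so far (a reduced-error-style ordering heuristic), and then keep only a leading fraction of the ordered ensemble. This is a minimal toy sketch, not the paper's exact procedure: the base learners (one-dimensional decision stumps), the data, and the evaluation on the training set itself are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D data: the class label is the sign of x, plus noise.
X = rng.normal(size=200)
y = (X + 0.5 * rng.normal(size=200) > 0).astype(int)

def train_stump(Xb, yb):
    """Pick the threshold/polarity that best separates the bootstrap sample."""
    best, best_acc = (0.0, 1), -1.0
    for t in np.unique(Xb):
        for pol in (1, -1):
            pred = (Xb > t).astype(int) if pol == 1 else (Xb <= t).astype(int)
            acc = (pred == yb).mean()
            if acc > best_acc:
                best_acc, best = acc, (t, pol)
    return best

def predict(stump, X):
    t, pol = stump
    return (X > t).astype(int) if pol == 1 else (X <= t).astype(int)

# Bagging: train T stumps, each on an independent bootstrap sample.
T = 25
ensemble = []
for _ in range(T):
    idx = rng.integers(0, len(X), len(X))
    ensemble.append(train_stump(X[idx], y[idx]))

# Ordered aggregation: greedily append the classifier whose inclusion
# minimizes the error of the running majority vote.
preds = np.array([predict(s, X) for s in ensemble])  # shape (T, n)
remaining, order = list(range(T)), []
votes = np.zeros(len(X))
while remaining:
    errors = [((votes + preds[i] > (len(order) + 1) / 2).astype(int) != y).mean()
              for i in remaining]
    best = remaining[int(np.argmin(errors))]
    order.append(best)
    votes += preds[best]
    remaining.remove(best)

# Prune: retain only the leading 20% of the ordered ensemble.
k = max(1, T // 5)
pruned_pred = (preds[order[:k]].sum(axis=0) > k / 2).astype(int)
full_pred = (preds.sum(axis=0) > T / 2).astype(int)
```

In the randomly ordered ensemble the error typically falls monotonically with ensemble size, whereas under this greedy ordering the error of the partial vote tends to dip to a minimum at an intermediate size, which is what makes retaining only the first `k` classifiers attractive.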
Gonzalo Martínez-Muñoz, Daniel Hernández-Lobato, Alberto Suárez, "An Analysis of Ensemble Pruning Techniques Based on Ordered Aggregation", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.31, no. 2, pp. 245-259, February 2009, doi:10.1109/TPAMI.2008.78