Multistrategy Ensemble Learning: Reducing Error by Combining Ensemble Learning Techniques
August 2004 (vol. 16 no. 8)
pp. 980-991

Abstract—Ensemble learning strategies, especially Boosting and Bagging decision trees, have demonstrated impressive capacities to improve the prediction accuracy of base learning algorithms. Further gains have been demonstrated by strategies that combine simple ensemble formation approaches. In this paper, we investigate the hypothesis that the improvement in accuracy of multistrategy approaches to ensemble learning is due to an increase in the diversity of ensemble members that are formed. In addition, guided by this hypothesis, we develop three new multistrategy ensemble learning techniques. Experimental results in a wide variety of natural domains suggest that these multistrategy ensemble learning techniques are, on average, more accurate than their component ensemble learning techniques.
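
The link between member diversity and ensemble error that the abstract hypothesizes can be illustrated with a toy calculation. The sketch below is not the paper's method or its diversity measure: it assumes a plain majority vote and uses mean pairwise disagreement as one simple stand-in for diversity. All function names and the tiny data set are hypothetical. It compares two three-member ensembles whose members each make two errors on six examples, once with identical members and once with errors on disjoint examples.

```python
from collections import Counter
from itertools import combinations


def majority_vote(predictions):
    """Combine one prediction per ensemble member into a single label."""
    return Counter(predictions).most_common(1)[0][0]


def ensemble_error(member_preds, labels):
    """Error rate of the majority vote.

    member_preds[m][i] is the label member m predicts for example i.
    """
    wrong = sum(
        majority_vote([preds[i] for preds in member_preds]) != y
        for i, y in enumerate(labels)
    )
    return wrong / len(labels)


def pairwise_disagreement(member_preds):
    """Mean fraction of examples on which two members differ.

    This is an assumed, illustrative diversity measure.
    """
    pairs = list(combinations(member_preds, 2))
    n = len(member_preds[0])
    total = sum(sum(a != b for a, b in zip(p, q)) / n for p, q in pairs)
    return total / len(pairs)


# Hypothetical toy data: six examples with binary labels.
labels = [1, 0, 1, 0, 1, 0]

# Three copies of the same member: every member errs on examples 0 and 1.
identical = [[0, 1, 1, 0, 1, 0]] * 3

# Three members with the same individual error rate (2/6 each),
# but each errs on a different pair of examples.
diverse = [
    [0, 1, 1, 0, 1, 0],  # errs on examples 0 and 1
    [1, 0, 0, 1, 1, 0],  # errs on examples 2 and 3
    [1, 0, 1, 0, 0, 1],  # errs on examples 4 and 5
]

for name, members in (("identical", identical), ("diverse", diverse)):
    print(
        name,
        "vote error:", round(ensemble_error(members, labels), 3),
        "disagreement:", round(pairwise_disagreement(members), 3),
    )
```

In this toy case the members are equally accurate individually, yet only the ensemble whose members disagree corrects their errors through voting. Real ensembles fall between these two extremes, and the example says nothing about how the paper's techniques generate diversity.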

Index Terms:
Boosting, bagging, ensemble learning, committee learning, multiboosting, bias, variance, ensemble diversity.
Geoffrey I. Webb, Zijian Zheng, "Multistrategy Ensemble Learning: Reducing Error by Combining Ensemble Learning Techniques," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 8, pp. 980-991, Aug. 2004, doi:10.1109/TKDE.2004.29