Issue No. 02 - February (2009 vol. 31)
Daniel Hernández-Lobato, Universidad Autónoma de Madrid, Cantoblanco
Gonzalo Martínez-Muñoz, Universidad Autónoma de Madrid, Cantoblanco
Alberto Suárez, Escuela Politécnica Superior, Madrid
The global prediction of a homogeneous ensemble of classifiers generated in independent applications of a randomized learning algorithm on a fixed training set is analyzed within a Bayesian framework. Assuming that majority voting is used, it is possible to estimate with a given confidence level the prediction of the complete ensemble by querying only a subset of classifiers. For a particular instance that needs to be classified, the polling of ensemble classifiers can be halted when the probability that the predicted class will not change when taking into account the remaining votes is above the specified confidence level. Experiments on a collection of benchmark classification problems using representative parallel ensembles, such as bagging and random forests, confirm the validity of the analysis and demonstrate the effectiveness of the instance-based ensemble pruning method proposed.
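The halting rule described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not stated in the abstract and may differ from the paper's exact formulation: two classes only, an odd ensemble size (so that majority voting cannot tie), and a uniform prior over the unknown fraction of votes for each class, under which the remaining votes follow a Polya urn (beta-binomial) distribution. The function names `prob_leader_holds` and `predict_with_early_stop` are hypothetical and chosen for illustration.

```python
from math import exp, lgamma


def log_beta(x, y):
    """Logarithm of the Beta function B(x, y)."""
    return lgamma(x) + lgamma(y) - lgamma(x + y)


def log_comb(n, k):
    """Logarithm of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def prob_leader_holds(lead_votes, other_votes, ensemble_size):
    """Probability that the current leading class still wins the full vote.

    Assumption (not from the abstract): uniform prior, so the number k of
    remaining votes for the leader is beta-binomial with parameters
    (lead_votes + 1, other_votes + 1).
    """
    remaining = ensemble_size - lead_votes - other_votes
    a, b = lead_votes + 1, other_votes + 1
    total = 0.0
    for k in range(remaining + 1):
        # The leader wins only with a strict majority of all votes.
        if lead_votes + k > other_votes + (remaining - k):
            total += exp(
                log_comb(remaining, k)
                + log_beta(a + k, b + remaining - k)
                - log_beta(a, b)
            )
    return total


def predict_with_early_stop(votes, ensemble_size, confidence=0.99):
    """Query votes (0 or 1) one at a time; stop once the leader is safe.

    Returns (predicted_class, number_of_classifiers_queried).
    """
    counts = [0, 0]
    leader, queried = 0, 0
    for vote in votes:
        counts[vote] += 1
        queried += 1
        leader = 0 if counts[0] >= counts[1] else 1
        p = prob_leader_holds(counts[leader], counts[1 - leader], ensemble_size)
        if p >= confidence:
            break
    return leader, queried


# Usage: a unanimous ensemble of 101 classifiers is decided early.
label, used = predict_with_early_stop(iter([1] * 101), ensemble_size=101)
print(label, used)
```

In this sketch `votes` is an iterator, so a classifier is evaluated only when its vote is requested. Instances on which the ensemble largely agrees stop after a small number of queries, while close votes may require polling nearly every classifier.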
Ensemble learning, bagging, random forests, ensemble pruning, instance-based pruning, Polya urn.
Daniel Hernández-Lobato, Gonzalo Martínez-Muñoz, Alberto Suárez, "Statistical Instance-Based Pruning in Ensembles of Independent Classifiers", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 31, no. 2, pp. 364-369, February 2009, doi:10.1109/TPAMI.2008.204