Issue No. 08 - August (1998 vol. 20)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/34.709601
<p><b>Abstract</b>: Much previous work on decision trees has focused on splitting criteria and on optimizing tree size, and the dilemma between overfitting and achieving maximum accuracy is seldom resolved. We propose a method for constructing a decision-tree-based classifier that maintains the highest accuracy on training data and improves in generalization accuracy as it grows in complexity. The classifier consists of multiple trees constructed systematically by pseudorandomly selecting subsets of components of the feature vector, that is, trees constructed in randomly chosen subspaces. In experiments on publicly available datasets, the subspace method is compared with single-tree classifiers and other forest construction methods and is shown to be superior. We also discuss independence between trees in a forest and relate it to the combined classification accuracy.</p>
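The construction described in the abstract can be illustrated with a minimal sketch, which is not the paper's implementation. Each tree is grown without pruning on the full training set but may split only on a pseudorandomly chosen subset of feature indices, and the forest combines the trees' outputs. Several details are assumptions made for illustration only: the Gini split criterion, the default subspace size of half the features, majority voting as the combination rule, and every function and parameter name (`build_tree`, `random_subspace_forest`, `subspace_dim`, and so on).

```python
import random
from collections import Counter


class Node:
    """Internal tree node: go left if x[feature] <= threshold, else right."""
    def __init__(self, feature, threshold, left, right):
        self.feature, self.threshold = feature, threshold
        self.left, self.right = left, right


def gini(labels):
    """Gini impurity of a non-empty list of labels (assumed split criterion)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(X, y, features):
    """Grow an unpruned tree that splits only on the given feature indices."""
    if len(set(y)) == 1:
        return y[0]  # pure leaf: store the class label
    best = None
    for f in features:
        values = sorted(set(row[f] for row in X))
        # Candidate thresholds: every distinct value except the largest,
        # so both sides of a split are always non-empty.
        for t in values[:-1]:
            left = [i for i, row in enumerate(X) if row[f] <= t]
            right = [i for i, row in enumerate(X) if row[f] > t]
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:
        # Samples are indistinguishable in this subspace: majority-label leaf.
        return Counter(y).most_common(1)[0][0]
    _, f, t, left, right = best
    return Node(f, t,
                build_tree([X[i] for i in left], [y[i] for i in left], features),
                build_tree([X[i] for i in right], [y[i] for i in right], features))


def predict_tree(node, x):
    """Follow splits from the root until a leaf (a bare label) is reached."""
    while isinstance(node, Node):
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node


def random_subspace_forest(X, y, n_trees=25, subspace_dim=None, seed=0):
    """Build n_trees trees, each restricted to a random subset of features."""
    rng = random.Random(seed)
    d = len(X[0])
    k = subspace_dim or max(1, d // 2)  # assumed default: half the features
    return [build_tree(X, y, rng.sample(range(d), k)) for _ in range(n_trees)]


def predict_forest(forest, x):
    """Combine the trees by majority vote (an assumed combination rule)."""
    votes = Counter(predict_tree(tree, x) for tree in forest)
    return votes.most_common(1)[0][0]
```

Because every tree sees all training samples and is grown until its leaves are pure, each tree classifies the training set perfectly whenever the samples remain distinguishable in its subspace. The trees differ only in which features they are allowed to use, which is what distinguishes this scheme from forests built by resampling the training data.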
Keywords: Pattern recognition, decision tree, decision forest, stochastic discrimination, decision combination, classifier combination, multiple-classifier system, bootstrapping.
Tin Kam Ho, "The Random Subspace Method for Constructing Decision Forests", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 20, no. 8, pp. 832-844, August 1998, doi:10.1109/34.709601