The authors present a statistical-heuristic feature selection criterion for constructing multibranching decision trees in noisy real-world domains. Real-world problems often have multivalued features. For these problems, multibranching decision trees provide a more efficient and more comprehensible solution than binary decision trees. The authors propose a statistical-heuristic criterion, the symmetrical tau, and then discuss its consistency with a Bayesian classifier and its built-in statistical test. The combination of a proportional-reduction-in-error measure and a cost-of-complexity heuristic makes the symmetrical tau a powerful criterion with many merits, including robustness to noise, fairness to multivalued features, the ability to handle Boolean combinations of logical features, and a middle-cut preference. The tau criterion also provides a natural basis for prepruning and dynamic error estimation. Illustrative examples are also presented.
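As a rough illustration of how a criterion of this kind could be computed, the following sketch evaluates the standard symmetric form of Goodman and Kruskal's tau on the contingency table of one feature against the class. The abstract does not state the formula, so equating this form with the authors' symmetrical tau is an assumption, and the function name `symmetrical_tau` and its arguments are hypothetical.

```python
from collections import Counter


def symmetrical_tau(feature_values, class_labels):
    """Symmetric Goodman-Kruskal tau between a feature and the class.

    Assumption: this standard symmetric form stands in for the
    criterion named in the abstract, which does not give the formula.
    """
    n = len(feature_values)
    if n == 0 or n != len(class_labels):
        raise ValueError("inputs must be non-empty and of equal length")

    joint = Counter(zip(feature_values, class_labels))  # cell counts
    row = Counter(feature_values)                       # feature marginals
    col = Counter(class_labels)                         # class marginals

    sum_row_sq = sum((c / n) ** 2 for c in row.values())
    sum_col_sq = sum((c / n) ** 2 for c in col.values())

    denom = 2.0 - sum_row_sq - sum_col_sq
    if denom == 0.0:
        # Both variables are constant, so no error can be reduced.
        return 0.0

    # Sum of P(ij)^2 / P(+j): predicting the feature from the class.
    given_class = sum((c / n) ** 2 / (col[y] / n) for (x, y), c in joint.items())
    # Sum of P(ij)^2 / P(i+): predicting the class from the feature.
    given_feature = sum((c / n) ** 2 / (row[x] / n) for (x, y), c in joint.items())

    return (given_class + given_feature - sum_row_sq - sum_col_sq) / denom


# A feature that determines the class versus one independent of it.
perfect = symmetrical_tau(["a", "a", "b", "b"], [0, 0, 1, 1])
independent = symmetrical_tau(["a", "a", "b", "b"], [0, 1, 0, 1])
```

Under this sketch, a tree builder would compute the score for each candidate feature at a node and branch on the one with the largest value. The measure ranges from 0 (no association) to 1 (perfect association).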
