Issue No. 03 - June (1990 vol. 5)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/64.54672
<p>TDIDT (top-down induction of decision trees) methods for heuristic rule generation lead to unnecessarily complex representations of induced knowledge and are overly sensitive to noise in training data. Practical alternatives to TDIDT approaches, which lead to more direct representations of the same knowledge, are examined. The alternatives are less susceptible to spurious correlations in small data sets and to noise in initial training data. These knowledge representation problems and alternatives are examined in the context of chess, for which a TDIDT algorithm called the ID3 algorithm was originally devised. Modifications to the ID3 algorithm are proposed so that users can heuristically measure the information content of attributes to guide search. The program iteratively examines all positive instances remaining to be covered, along with negative training-set instances; search is not subject to irrelevant context restrictions. This algorithm is no more complex than TDIDT, is just as fast, is less sensitive to noise, and leads to clearer representations of the information present in training-set data.</p>
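As a rough illustration of what measuring the information content of attributes can look like, the sketch below computes the standard entropy-based information gain that ID3 is generally known to use for ranking attributes. It is not taken from the article: the function names, the attribute names (in_check, rook_safe), and the four-row training set are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Reduction in label entropy obtained by splitting the rows on attr."""
    total = len(rows)
    remainder = 0.0
    for value in {row[attr] for row in rows}:
        subset = [lab for row, lab in zip(rows, labels) if row[attr] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


# Hypothetical toy training set (illustrative only, not data from the article).
rows = [
    {"in_check": "yes", "rook_safe": "no"},
    {"in_check": "yes", "rook_safe": "yes"},
    {"in_check": "no", "rook_safe": "no"},
    {"in_check": "no", "rook_safe": "yes"},
]
labels = ["lost", "lost", "won", "won"]

# Rank attributes by how much each one tells us about the class.
gains = {attr: information_gain(rows, labels, attr) for attr in rows[0]}
best_attribute = max(gains, key=gains.get)
```

In this toy set, in_check separates the two classes completely while rook_safe carries no information about the outcome, so a search guided by this measure would consider in_check first.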
N. Gray, "Capturing Knowledge Through Top-Down Induction of Decision Trees," in IEEE Intelligent Systems, vol. 5, no. 3, pp. 41-50, 1990.