Comparison of Two Families of Entropy-Based Classification Measures with and without Feature Selection
Proceedings of the 34th Annual Hawaii International Conference on System Sciences (2001)
Jan. 3, 2001 to Jan. 6, 2001
Many decision tree (DT) induction algorithms, including the popular C4.5 family, are based on the Conditional Entropy (CE) measure family. An interesting question is how other entropy measure families, such as Class-Attribute Mutual Information (CAMI), perform relative to CE. We therefore conducted a theoretical analysis of the CAMI family that enabled us to expose relationships with CE and correct a previous CAMI result. Our computational study showed only a small difference in performance between the two families. Since feature selection is important in DT induction, we conducted a theoretical analysis of a recently published blurring-based feature selection algorithm and developed a new feature selection algorithm. We tested this algorithm on a wider set of test problems than in the comparable study in order to identify benefits and limitations of blurring-based feature selection. These results provide theoretical and computational insight into entropy-based induction measures and feature selection algorithms.
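As a rough illustration of the two quantities underlying these measure families, the sketch below computes the conditional entropy H(C|A) of a class variable C given one discrete attribute A, and the mutual information I(C;A) between them. It is not the paper's formulation: the function names and the toy data are assumptions made for this example, and the specific CE and CAMI family members analyzed in the paper (and any normalizations they use) are not reproduced here. The only relationship relied on is the standard identity I(C;A) = H(C) - H(C|A).

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy H(C) in bits of a sequence of class labels."""
    n = len(labels)
    return -sum((k / n) * log2(k / n) for k in Counter(labels).values())


def conditional_entropy(attr, labels):
    """H(C|A): entropy of the class within each attribute value, weighted by value frequency."""
    n = len(labels)
    groups = {}
    for a, c in zip(attr, labels):
        groups.setdefault(a, []).append(c)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def mutual_information(attr, labels):
    """I(C;A) computed directly from joint and marginal counts."""
    n = len(labels)
    joint = Counter(zip(attr, labels))
    count_a = Counter(attr)
    count_c = Counter(labels)
    return sum(
        (k / n) * log2(k * n / (count_a[a] * count_c[c]))
        for (a, c), k in joint.items()
    )


# Hypothetical toy data: one attribute that determines the class, one that is independent of it.
classes = [0, 0, 1, 1]
informative = ["x", "x", "y", "y"]
uninformative = ["x", "y", "x", "y"]
```

On this toy data the informative attribute leaves no class uncertainty (H(C|A) = 0, I(C;A) = 1 bit), while the uninformative one leaves all of it (H(C|A) = 1 bit, I(C;A) = 0). A split criterion from either family would therefore rank the two attributes the same way here.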
Q. Weng, K. Bryson and K. Giles, "Comparison of Two Families of Entropy-Based Classification Measures with and without Feature Selection," Proceedings of the 34th Annual Hawaii International Conference on System Sciences (HICSS), Maui, Hawaii, 2001, p. 3014.