Brussels, Belgium
Dec. 10, 2012 to Dec. 13, 2012
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDM.2012.143
Using hypothesis margins to measure the quality of a feature set has been a growing line of research over the last decade. However, most previous algorithms, such as Simba, were developed under the large hypothesis margin principle of the 1-NN algorithm. Little attention has been paid so far to exploiting the hypothesis margins of boosting to evaluate features. Boosting is well known to maximize the hypothesis margins of the training examples, in particular the average margin, which is known to be the first statistic that considers the whole margin distribution. In this paper, we describe how to use the mean margins of the training examples under boosting to select features. A weight criterion, termed Margin Fraction (MF), is assigned to each feature that contributes to the average margin distribution combined in the final output produced by boosting. Applying MF within a sequential backward selection method yields a new embedded selection algorithm, called SBS-MF. Experiments on several data sets compare the proposed SBS-MF with two boosting-based feature selection approaches and with Simba. The results show that SBS-MF is effective in most cases.
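
The abstract does not give the exact definition of MF, so the following is only a rough sketch of the general idea under stated assumptions, not the paper's algorithm. It assumes discrete AdaBoost over single-feature decision stumps with labels in {-1, +1}, and it takes a feature's weight to be its share of the average normalized margin, i.e. the mean over examples of y * alpha_t * h_t(x) / sum(alpha) summed over the stumps that split on that feature. The function names (`best_stump`, `adaboost`, `feature_margin_shares`, `backward_select`) are hypothetical.

```python
import math

def best_stump(X, y, w, features):
    """Return (error, feature, threshold, polarity) of the lowest weighted-error stump."""
    best = None
    for j in features:
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                # The stump predicts `pol` when x[j] <= thr, and `-pol` otherwise.
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (pol if row[j] <= thr else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost(X, y, features, rounds=10):
    """Discrete AdaBoost over stumps; returns a list of (alpha, feature, thr, pol)."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, j, thr, pol = best_stump(X, y, w, features)
        err = min(max(err, 1e-10), 1 - 1e-10)  # clamp so alpha stays finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, j, thr, pol))
        # Re-weight: misclassified examples gain weight, then normalize.
        w = [wi * math.exp(-alpha * yi * (pol if row[j] <= thr else -pol))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def feature_margin_shares(X, y, ensemble, features):
    """Assumed weight: each feature's share of the average normalized margin."""
    norm = sum(alpha for alpha, _, _, _ in ensemble)
    share = {j: 0.0 for j in features}
    for alpha, j, thr, pol in ensemble:
        for row, yi in zip(X, y):
            vote = pol if row[j] <= thr else -pol
            share[j] += alpha * yi * vote / (norm * len(X))
    return share  # the shares sum to the average normalized margin

def backward_select(X, y, n_keep, rounds=10):
    """Sequential backward selection: retrain, then drop the lowest-share feature."""
    features = list(range(len(X[0])))
    while len(features) > n_keep:
        ensemble = adaboost(X, y, features, rounds)
        share = feature_margin_shares(X, y, ensemble, features)
        features.remove(min(features, key=lambda j: share[j]))
    return features
```

Here `X` is a list of numeric feature rows and `y` a list of -1/+1 labels; `backward_select(X, y, n_keep=k)` returns the indices of the `k` surviving features. Retraining after every removal keeps the selection embedded in the learner, at the cost of one boosting run per eliminated feature.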
Average margin, feature selection, boosting
Malak Alshawabkeh, Javed A. Aslam, Jennifer G. Dy, David Kaeli, "Feature Weighting and Selection Using Hypothesis Margin of Boosting", IEEE International Conference on Data Mining (ICDM), 2012, pp. 41-50, doi:10.1109/ICDM.2012.143