2011 10th International Conference on Machine Learning and Applications and Workshops
Feature Selection Metric Using AUC Margin for Small Samples and Imbalanced Data Classification Problems
Honolulu, Hawaii USA
December 18-December 21
ISBN: 978-0-7695-4607-0
Feature selection helps address high-dimensional problems by retaining only the features that are most important for the classification task. However, traditional feature selection methods fail to account for imbalanced class distributions, leading to poor predictions for minority class samples. Recently, there has been growing interest in the area under the ROC curve (AUC) metric because it can provide meaningful performance measures in the presence of imbalanced data. In this paper, we propose a new margin-based feature selection metric that defines the quality of a set of features by the maximized AUC margin it induces during learning with boosting. Our algorithm measures the cumulative effect each feature has on the margin distribution associated with the weighted linear combination that boosting produces over the positive and the negative examples. Experiments on various real imbalanced data sets show the effectiveness of our algorithm at selecting informative features from small data sets with skewed class distributions.
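As a rough illustration of the quantities the abstract refers to, the sketch below computes the AUC as the fraction of positive/negative pairs that a scoring function ranks correctly, together with the pairwise score differences that an AUC-style margin is built on. It is not the paper's algorithm: the function names `pairwise_margins` and `auc_from_margins`, the toy scores, and the choice to treat the raw score difference as the margin are all assumptions made for this example.

```python
from itertools import product


def pairwise_margins(scores, labels):
    """Return score(positive) - score(negative) for every positive/negative pair.

    Assumption: labels are 1 for the positive (minority) class and 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return [p - n for p, n in product(pos, neg)]


def auc_from_margins(margins):
    """AUC as the fraction of correctly ranked pairs; ties count as half."""
    if not margins:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if m > 0 else 0.5 if m == 0 else 0.0 for m in margins)
    return wins / len(margins)


# Toy imbalanced sample with made-up scores: 2 positives, 4 negatives.
scores = [0.9, 0.4, 0.5, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 0, 0, 0]

margins = pairwise_margins(scores, labels)
auc = auc_from_margins(margins)
smallest_margin = min(margins)  # the worst-ranked positive/negative pair

print(f"pairs={len(margins)} auc={auc:.3f} smallest_margin={smallest_margin:.2f}")
```

In this toy case one of the eight pairs is ranked incorrectly, so the smallest pairwise difference is negative. A margin-based criterion of the kind the abstract describes would presumably favor feature sets whose combined score pushes such differences upward, but the exact definition is given only in the paper itself.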
Index Terms:
boosting, area under the ROC curve (AUC), feature selection, margin
Citation:
Malak Alshawabkeh, Javed A. Aslam, Jennifer Dy, David Kaeli, "Feature Selection Metric Using AUC Margin for Small Samples and Imbalanced Data Classification Problems," icmla, vol. 1, pp. 145-150, 2011 10th International Conference on Machine Learning and Applications and Workshops, 2011