Brussels, Belgium
Dec. 10, 2012
ISBN: 978-1-4673-5164-5
pp: 49-56
ABSTRACT
Developing accurate, reliable and easy-to-use diagnostic tests depends on identifying a small set of highly discriminative biomarkers. This task can be cast as feature selection within a pattern recognition context. Medical data are usually of the "wide" type, where the number of features is substantially larger than the number of instances. With the abundance of feature ranking methods, it is difficult to pick the most suitable one and to choose a final, consistent feature subset. Ensembles of ranking methods have been recommended for the task, but their stability and accuracy have not been evaluated across different ranking methods. Here we present a case study consisting of 429 samples of exhaled air from smokers, 83% of whom suffer from COPD (chronic obstructive pulmonary disease). The task is to identify a discriminative subset of the 1929 volatile organic compounds (VOCs) measured through mass spectrometry. Using Pareto analysis, 16 feature ranking ensembles were evaluated with respect to three criteria: classification accuracy, area under the ROC curve and the stability of the ensemble selection. The t-statistic was rated the best among the 16 feature rankers, outperforming the currently favoured SVM ranker.
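To make the idea of a t-statistic ranking ensemble and its stability more concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the ensembles were built or which stability measure was used, so the stratified bootstrap with rank averaging, the consistency index for two equal-sized subsets, and all function names (`t_statistic_scores`, `ensemble_select`, `consistency_index`) are assumptions made for illustration.

```python
import numpy as np


def t_statistic_scores(X, y):
    """Absolute Welch t-statistic of every column of X for binary labels y (0/1)."""
    a, b = X[y == 0], X[y == 1]
    var_term = a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b)
    # Floor the variance term so features that are constant in both classes
    # do not produce a division by zero.
    denom = np.sqrt(np.maximum(var_term, 1e-12))
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / denom


def ensemble_select(X, y, k, n_members=25, seed=0):
    """Pick k features by averaging rank positions over bootstrap resamples.

    Assumption: each ensemble member ranks the features on a stratified
    bootstrap sample (resampling within each class, so each class needs
    at least two instances).
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    rank_sum = np.zeros(n_features)
    class_indices = [np.flatnonzero(y == c) for c in (0, 1)]
    for _ in range(n_members):
        idx = np.concatenate(
            [rng.choice(ci, size=len(ci), replace=True) for ci in class_indices]
        )
        scores = t_statistic_scores(X[idx], y[idx])
        order = np.argsort(-scores)            # best feature first
        ranks = np.empty(n_features)
        ranks[order] = np.arange(n_features)   # rank 0 = best
        rank_sum += ranks
    # The smallest summed rank corresponds to the best average position.
    return np.argsort(rank_sum)[:k]


def consistency_index(subset_a, subset_b, n_features):
    """Consistency index for two feature subsets of equal size k, 0 < k < n_features.

    Returns 1 for identical subsets and values near 0 for an overlap that
    is what chance alone would produce.
    """
    k = len(subset_a)
    r = len(set(subset_a) & set(subset_b))
    return (r * n_features - k * k) / (k * (n_features - k))


if __name__ == "__main__":
    # Synthetic "wide" data: 60 instances, 200 features, 3 of them informative.
    rng = np.random.default_rng(1)
    y = np.array([0] * 30 + [1] * 30)
    X = rng.normal(size=(60, 200))
    X[y == 1, :3] += 3.0
    first = ensemble_select(X, y, k=3, seed=0)
    second = ensemble_select(X, y, k=3, seed=1)
    print(sorted(first.tolist()), consistency_index(first, second, X.shape[1]))
```

Comparing the subsets chosen by differently seeded ensembles, as in the last lines, is one simple way to probe the stability of the selection; accuracy and area under the ROC curve would need a classifier and held-out data on top of this.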
INDEX TERMS
Support vector machines, Stability criteria, Accuracy, Indexes, Educational institutions, Vegetation, COPD, Feature selection, feature ranking, classifier ensembles, stability index
CITATION
Ludmila I. Kuncheva, Christopher J. Smith, Yasir Syed, Christopher O. Phillips, Keir E. Lewis, "Evaluation of Feature Ranking Ensembles for High-Dimensional Biomedical Data: A Case Study", ICDMW, 2012, 2013 IEEE 13th International Conference on Data Mining Workshops, pp. 49-56, doi:10.1109/ICDMW.2012.12