18th International Conference on Pattern Recognition (ICPR'06) Volume 4
Linear model combining by optimizing the Area under the ROC curve
Hong Kong
August 20-August 24
ISBN: 0-7695-2521-0
In some classification problems, such as the detection of illnesses in patients, the classes are highly unbalanced and the misclassification costs differ significantly between classes. In such cases it is better to optimize the ordering of the data, i.e. the Area under the ROC curve (AUC), than to minimize the classification error. In this paper we propose to fit a linear combination of features (or base model outputs) by optimizing the AUC. The advantages are that the optimization requires only a relatively small training set and that the training set may have a large class imbalance. Furthermore, the classifier makes no distributional assumptions, which makes it well suited to combining the outputs of base classifiers. In an application to the detection of interstitial lung diseases, the method is shown to be very advantageous and to outperform standard classification rules.
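To make the idea concrete, the following is a minimal sketch of AUC-driven weight fitting. It is not the optimization procedure of the paper, which the abstract does not specify. It assumes a sigmoid-smoothed pairwise AUC surrogate maximized by plain gradient ascent, an equal-weight starting point, and a hand-made toy data set with 2 positives and 6 negatives. All names (`auc`, `fit_auc_linear`, `w_equal`, `w_fit`) are hypothetical.

```python
import math

def auc(scores_pos, scores_neg):
    """Empirical AUC: fraction of (positive, negative) pairs in which the
    positive scores higher; ties count as half a correct pair."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

def project(w, x):
    """Linear combination w . x of the features (or base model outputs)."""
    return sum(wi * xi for wi, xi in zip(w, x))

def fit_auc_linear(pos, neg, steps=200, lr=0.5):
    """Gradient ascent on a smoothed AUC (assumed surrogate, see lead-in).

    The indicator [w.xp > w.xn] of each pair is replaced by
    sigmoid(w.(xp - xn)) so that the objective is differentiable."""
    dim = len(pos[0])
    # Assumption: start from equal weights, i.e. a plain average.
    w = [1.0 / math.sqrt(dim)] * dim
    n_pairs = len(pos) * len(neg)
    for _ in range(steps):
        grad = [0.0] * dim
        for xp in pos:
            for xn in neg:
                diff = [a - b for a, b in zip(xp, xn)]
                s = 1.0 / (1.0 + math.exp(-project(w, diff)))
                g = s * (1.0 - s)  # derivative of the sigmoid
                for k in range(dim):
                    grad[k] += g * diff[k]
        w = [wi + lr * gk / n_pairs for wi, gk in zip(w, grad)]
        # AUC is invariant to the scale of w, so keep w at unit length;
        # this also keeps the sigmoid from saturating.
        norm = math.sqrt(sum(wi * wi for wi in w)) or 1.0
        w = [wi / norm for wi in w]
    return w

# Toy data (made up for illustration): feature 0 is informative,
# feature 1 is noise; the classes are imbalanced (2 vs. 6 objects).
pos = [(2.0, 1.0), (2.0, -1.0)]
neg = [(0.0, 1.0), (0.0, -1.0), (-1.0, 1.0),
       (-1.0, -1.0), (0.0, 0.0), (-1.0, 0.0)]

w_equal = [1.0 / math.sqrt(2.0)] * 2
w_fit = fit_auc_linear(pos, neg)

auc_equal = auc([project(w_equal, x) for x in pos],
                [project(w_equal, x) for x in neg])
auc_fit = auc([project(w_fit, x) for x in pos],
              [project(w_fit, x) for x in neg])
print("equal weights:", w_equal, "AUC =", auc_equal)
print("fitted weights:", w_fit, "AUC =", auc_fit)
```

Because the objective depends only on the ordering of positive/negative pairs, no class priors or distributional model enter the fit, which is the property the abstract highlights. On this toy set the fitted weights shift toward the informative feature; the pairwise loops cost O(steps × positives × negatives), so a larger problem would need subsampling of pairs or a vectorized implementation.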
Index Terms:
chest radiography, pattern recognition, combining classifiers, class imbalance, area under the ROC curve
Citation:
David M.J. Tax, Robert P.W. Duin, "Linear model combining by optimizing the Area under the ROC curve," 18th International Conference on Pattern Recognition (ICPR'06), vol. 4, pp. 119-122, 2006