Issue No. 10 - Oct. 2013 (vol. 25)
H. Altay Guvenir, Bilkent University, Ankara
Murat Kurtcephe, Case Western Reserve University, Cleveland
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2012.214
In recent years, the problem of learning a real-valued function that induces a ranking over an instance space has gained importance in the machine learning literature. Here, we propose a supervised algorithm, called ranking instances by maximizing the area under the ROC curve (RIMARC), that learns a ranking function. Since the area under the ROC curve (AUC) is a widely accepted performance measure for evaluating the quality of a ranking, the algorithm aims to maximize the AUC value directly. For a single categorical feature, we show the necessary and sufficient condition that any ranking function must satisfy to achieve the maximum AUC. We also sketch a method for discretizing a continuous feature so that the maximum AUC is reached as well. RIMARC uses a heuristic to extend this maximization to all features of a data set. The ranking function learned by the RIMARC algorithm is in a human-readable form; therefore, it provides valuable information to domain experts for decision making. The performance of RIMARC is evaluated on many real-life data sets and compared with that of different state-of-the-art algorithms. Evaluations of the AUC metric show that RIMARC achieves significantly better performance than other similar methods.
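To make the single-feature setting concrete, the following is a minimal Python sketch, not the RIMARC algorithm itself. It computes the AUC of a ranking as the fraction of (positive, negative) instance pairs that are ordered correctly, counting ties as one half, and scores each value of a categorical feature by the fraction of positive instances observed with that value. That scoring rule is an assumption made for illustration; the abstract does not state the condition the paper proves. The function names `auc` and `positive_rate_scores` and the toy data are likewise hypothetical.

```python
from collections import defaultdict
from itertools import product


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative instance counts as 0.5.
    Labels are 1 (positive) or 0 (negative).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    total = 0.0
    for p, n in product(pos, neg):
        if p > n:
            total += 1.0
        elif p == n:
            total += 0.5
    return total / (len(pos) * len(neg))


def positive_rate_scores(values, labels):
    """Map each categorical value to its observed fraction of positives.

    Assumed scoring rule for illustration only.
    """
    counts = defaultdict(lambda: [0, 0])  # value -> [positives, total]
    for v, y in zip(values, labels):
        counts[v][0] += y
        counts[v][1] += 1
    return {v: p / t for v, (p, t) in counts.items()}


# Made-up toy data: one categorical feature with values "a", "b", "c".
values = ["a", "a", "a", "b", "b", "c", "c", "c"]
labels = [1, 1, 0, 1, 0, 0, 0, 0]

score_of = positive_rate_scores(values, labels)
scores = [score_of[v] for v in values]
print(score_of)
print(auc(scores, labels))
```

Because instances that share a feature value necessarily receive the same score, ties between positives and negatives are unavoidable for a single categorical feature, which is why the pairwise count above handles ties explicitly.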
Training, Nickel, Algorithm design and analysis, Machine learning algorithms, Machine learning, Measurement, Training data, decision support, Ranking, data mining
H. Altay Guvenir, Murat Kurtcephe, "Ranking Instances by Maximizing the Area under ROC Curve", IEEE Transactions on Knowledge & Data Engineering, vol. 25, no. 10, pp. 2356-2366, Oct. 2013, doi:10.1109/TKDE.2012.214