Issue No. 05 - Sept.-Oct. (2012 vol. 9)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.21
Blaise Hanczar, LIPADE, Univ. Paris Descartes, Paris, France
Avner Bar-Hen, MAP5, Univ. Paris Descartes, Paris, France
One of the major aims of many microarray experiments is to build discriminatory diagnosis and prognosis models. A large number of supervised methods have been proposed in the literature for microarray-based classification for this purpose. Model evaluation and comparison is a critical issue and, most of the time, is based on the classification cost. This classification cost depends on the costs of false positives and false negatives, which are generally unknown in diagnostic problems. This uncertainty may strongly affect the evaluation and comparison of classifiers. We propose a new measure of classifier performance that takes the uncertainty of the error into account. We represent the available knowledge about the costs by a distribution function defined on the ratio of the costs. The performance of a classifier is therefore computed over the set of all possible costs, weighted by their probability distribution. Our method is tested on both artificial and real microarray data sets. We show that the performance of classifiers depends strongly on the ratio of the classification costs. In many cases, the best classifier can be identified by our new measure whereas the classic error measures fail.
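The idea of averaging the classification cost over a distribution of cost ratios can be illustrated with a minimal sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the cost ratio is reparameterized as a normalized false-negative cost c in (0, 1), with a false positive costing 1 - c; the knowledge about c is modeled with a Beta(a, b) density; and the function names `beta_pdf`, `cost_at`, and `weighted_cost` are hypothetical.

```python
import math


def beta_pdf(c, a, b):
    """Density of a Beta(a, b) distribution at c in (0, 1) (assumed prior on the cost)."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * c ** (a - 1) * (1 - c) ** (b - 1)


def cost_at(c, fpr, fnr, prevalence):
    """Expected cost when a false negative costs c and a false positive costs 1 - c."""
    return c * prevalence * fnr + (1 - c) * (1 - prevalence) * fpr


def weighted_cost(fpr, fnr, prevalence, a, b, n=10_000):
    """Midpoint-rule integral of cost_at(c) * beta_pdf(c, a, b) over c in (0, 1)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        c = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += cost_at(c, fpr, fnr, prevalence) * beta_pdf(c, a, b)
    return total * h


# Two hypothetical classifiers with the same plain error rate (0.20 at prevalence 0.5).
clf_a = {"fpr": 0.10, "fnr": 0.30}
clf_b = {"fpr": 0.30, "fnr": 0.10}

# Beta(5, 2) puts most of its mass on c > 0.5, i.e. false negatives are assumed costlier.
score_a = weighted_cost(clf_a["fpr"], clf_a["fnr"], 0.5, a=5, b=2)
score_b = weighted_cost(clf_b["fpr"], clf_b["fnr"], 0.5, a=5, b=2)
print("A:", score_a, "B:", score_b)
```

In this toy setting the plain error rate cannot separate the two classifiers, while the weighted cost is lower for classifier B, which makes fewer false negatives. A different assumed distribution over c could reverse the ranking.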
probability, biology computing, genetics, genomics, lab-on-a-chip, pattern classification, real microarray data sets, classifier performance, gene expression data, discriminatory diagnosis, prognosis models, microarray-based classification, probability distribution, artificial microarray data sets, Error analysis, Cost function, Support vector machines, Bioinformatics, Computational biology, Measurement uncertainty, Training, gene expression, supervised classification, microarray analysis
B. Hanczar and A. Bar-Hen, "A New Measure of Classifier Performance for Gene Expression Data," in IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 5, pp. 1379-1386, 2012.