Issue No. 04 - October-December (2010 vol. 7)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2008.138
Serious concerns have been raised about the reproducibility of gene signatures derived from high-throughput techniques such as microarrays. Studies analyzing similar samples often report poorly overlapping results, and the p-value usually lacks biological context. We propose a nonparametric ReDiscovery Curve (RDCurve) method to estimate how frequently an identified gene signature is rediscovered. Given a ranking procedure and a data set with replicated measurements, the RDCurve bootstraps the data set, repeatedly applies the ranking procedure, selects a subset of the k most important genes, and estimates the probability that the selected subset is rediscovered. We also propose a permutation scheme to estimate the confidence band under the null hypothesis, against which the significance of the RDCurve can be judged. The method is nonparametric and model-independent. With the RDCurve, we can assess the signal-to-noise ratio of the data, compare the performance of ranking procedures in terms of their expected rediscovery rates, and choose the number of genes to be reported.
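The procedure outlined in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the exact rediscovery estimator, so the code assumes it is the average overlap between the top-k genes of the full data set and the top-k genes of each bootstrap resample. The ranking statistic (`t_like_score`), the stratified resampling, the two-group 0/1 labels, and all function and parameter names are hypothetical choices made for illustration.

```python
import numpy as np

def t_like_score(X, y):
    """Hypothetical ranking procedure: absolute Welch-style statistic per gene.
    X is genes x samples; y holds 0/1 group labels (>= 2 replicates per group)."""
    a, b = X[:, y == 0], X[:, y == 1]
    se = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1]
                 + b.var(axis=1, ddof=1) / b.shape[1])
    return np.abs(a.mean(axis=1) - b.mean(axis=1)) / (se + 1e-12)

def stratified_bootstrap(y, rng):
    """Resample sample indices with replacement within each group (assumption)."""
    parts = [rng.choice(np.flatnonzero(y == c), size=int((y == c).sum()), replace=True)
             for c in np.unique(y)]
    return np.concatenate(parts)

def rd_curve(X, y, ks, n_boot=100, rank=t_like_score, seed=0):
    """Assumed estimator: mean fraction of the full-data top-k genes that are
    found again in the top-k of each bootstrap resample, for every k in ks."""
    rng = np.random.default_rng(seed)
    ref_order = np.argsort(-rank(X, y))          # ranking on the full data
    curve = np.zeros(len(ks))
    for _ in range(n_boot):
        idx = stratified_bootstrap(y, rng)
        order = np.argsort(-rank(X[:, idx], y[idx]))
        for j, k in enumerate(ks):
            curve[j] += np.intersect1d(ref_order[:k], order[:k]).size / k
    return curve / n_boot

def null_band(X, y, ks, n_perm=20, n_boot=20, q=0.95, seed=1):
    """Assumed permutation scheme: shuffle the group labels, recompute the
    curve, and take the pointwise q-quantile as the upper null band."""
    rng = np.random.default_rng(seed)
    curves = [rd_curve(X, rng.permutation(y), ks, n_boot=n_boot,
                       seed=int(rng.integers(1 << 31)))
              for _ in range(n_perm)]
    return np.quantile(np.array(curves), q, axis=0)

# Synthetic usage example: 200 genes, 10 replicates per group,
# the first 20 genes carry a simulated group difference.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 20))
y = np.repeat([0, 1], 10)
X[:20, y == 1] += 4.0
ks = [5, 20, 50]
print("rediscovery rate:", rd_curve(X, y, ks))
print("null upper band: ", null_band(X, y, ks))
```

Under these assumptions, a curve that sits well above the permutation band at a given k suggests that the top-k list is more stable than chance, and the k at which the curve starts to fall toward the band is one possible guide for how many genes to report.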
Stability, Testing, RNA, Statistical analysis, Statistical distributions, Reproducibility of results, Frequency estimation, Biomedical measurements, Statistics, Genetics
"RDCurve: A Nonparametric Method to Evaluate the Stability of Ranking Procedures", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 7, no. 4, pp. 719-726, October-December 2010, doi:10.1109/TCBB.2008.138