A Stochastic Downhill Search Algorithm for Estimating the Local False Discovery Rate
July-September 2004 (vol. 1 no. 3)
pp. 98-108
Screening for differential gene expression in microarray studies leads to difficult large-scale multiple testing problems. The local false discovery rate is a statistical concept for quantifying uncertainty in multiple testing. In this paper, we introduce a novel estimator for the local false discovery rate that is based on an algorithm that splits all genes into two groups, representing induced and noninduced genes, respectively. Starting from the full set of genes, we successively exclude genes until the gene-wise p-values of the remaining genes look like a typical sample from a uniform distribution. In comparison to other methods, our algorithm performs comparably in detecting the shape of the local false discovery rate and has a smaller bias in estimating the overall percentage of noninduced genes. Our algorithm is implemented in the Bioconductor-compatible R package TWILIGHT version 1.0.1, which is available from http://compdiag.molgen.mpg.de/software or from the Bioconductor project at http://www.bioconductor.org.
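
The exclusion idea described in the abstract can be illustrated with a minimal sketch. It is not the TWILIGHT implementation: the abstract does not state the objective function, so the Kolmogorov-Smirnov distance to the uniform distribution, the exclusion penalty, and the names `ks_distance`, `downhill_split`, `penalty`, and `steps` are all assumptions made for illustration. The sketch starts with every gene kept, flips the membership of one randomly chosen gene at a time, and accepts a flip only if the objective decreases.

```python
import random


def ks_distance(pvals):
    """Largest gap between the empirical CDF of pvals and the Uniform(0,1) CDF."""
    n = len(pvals)
    if n == 0:
        return 1.0  # convention for an empty set: worst possible distance
    s = sorted(pvals)
    return max(max((i + 1) / n - p, p - i / n) for i, p in enumerate(s))


def downhill_split(pvals, penalty=0.05, steps=3000, seed=0):
    """Stochastic downhill search over kept/excluded labels (illustrative only).

    Objective (an assumption, not taken from the paper):
        KS distance of the kept p-values to uniform
        + penalty * fraction of excluded genes.
    Returns the keep mask, the kept fraction, and the final objective value.
    """
    rng = random.Random(seed)
    m = len(pvals)
    keep = [True] * m  # start from the full set of genes

    def objective(mask):
        kept = [p for p, k in zip(pvals, mask) if k]
        return ks_distance(kept) + penalty * (m - len(kept)) / m

    best = objective(keep)
    for _ in range(steps):
        i = rng.randrange(m)
        keep[i] = not keep[i]      # propose flipping one gene
        candidate = objective(keep)
        if candidate < best:
            best = candidate       # downhill move: accept
        else:
            keep[i] = not keep[i]  # otherwise undo the flip
    pi0_hat = sum(keep) / m        # kept fraction, read as share of noninduced genes
    return keep, pi0_hat, best


if __name__ == "__main__":
    # Simulated data (hypothetical): 240 null p-values plus 60 skewed toward zero.
    rng = random.Random(1)
    pvals = [rng.random() for _ in range(240)] + [rng.random() ** 4 for _ in range(60)]
    keep, pi0_hat, value = downhill_split(pvals)
    print(f"kept fraction: {pi0_hat:.2f}, objective: {value:.3f}")
```

In this sketch the fraction of kept genes serves as a rough estimate of the percentage of noninduced genes. Without the penalty term, the search could keep excluding genes at no cost, so some penalty on exclusions is needed; its exact form here is a guess rather than the authors' choice.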

Index Terms:
Local false discovery rates, stochastic search algorithms, microarray analysis, biology and genetics.
Citation:
Stefanie Scheid, Rainer Spang, "A Stochastic Downhill Search Algorithm for Estimating the Local False Discovery Rate," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 1, no. 3, pp. 98-108, July-Sept. 2004, doi:10.1109/TCBB.2004.24