Issue No.03 - July-September (2008 vol.5)
pp: 423-431
ABSTRACT
A principal goal of microarray studies is to identify the genes showing differential expression under distinct conditions. In such studies, the selection of an optimal test statistic is a crucial challenge, which depends on the type and amount of data under analysis. While previous studies on simulated or spike-in datasets do not provide practical guidance on how to choose the best method for a given real dataset, we introduce an enhanced reproducibility-optimization procedure, which enables the selection of a suitable gene-ranking statistic directly from the data. In comparison with existing ranking methods, the reproducibility-optimized statistic shows good performance consistently under various simulated conditions and on an Affymetrix spike-in dataset. Further, the feasibility of the novel statistic is confirmed in a practical research setting using data from an in-house cDNA microarray study of asthma-related gene expression changes. These results suggest that the procedure facilitates the selection of an appropriate test statistic for a given dataset without relying on a priori assumptions, which may bias the findings and their interpretation. Moreover, the general reproducibility-optimization procedure is not limited to detecting differential expression only but could be extended to a wide range of other applications as well.
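As an illustration only, the idea of choosing a gene-ranking statistic by its reproducibility can be sketched in a few lines of Python. This is not the authors' procedure: the abstract does not specify the family of statistics, the reproducibility measure, or the bootstrap scheme. The two candidate statistics, the top-k overlap measure, and every function and parameter name below are assumptions made for the sketch.

```python
import random
import statistics


def mean_diff(a, b):
    """Absolute difference of group means (assumed candidate statistic)."""
    return abs(statistics.fmean(a) - statistics.fmean(b))


def t_like(a, b):
    """Absolute Welch-type t-statistic (assumed candidate statistic).

    The small constant guards against a zero standard error, which can
    occur when a bootstrap sample repeats a single column.
    """
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return mean_diff(a, b) / (se + 1e-12)


def top_k(data_a, data_b, stat, k):
    """Return the indices of the k genes with the largest statistic."""
    scores = [stat(a, b) for a, b in zip(data_a, data_b)]
    order = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
    return set(order[:k])


def resample(data, rng):
    """Bootstrap the sample columns, using the same columns for every gene."""
    n = len(data[0])
    cols = [rng.randrange(n) for _ in range(n)]
    return [[row[c] for c in cols] for row in data]


def reproducibility(data_a, data_b, stat, k, n_boot=50, seed=0):
    """Mean top-k overlap between pairs of bootstrap datasets (assumed measure)."""
    rng = random.Random(seed)
    overlaps = []
    for _ in range(n_boot):
        first = top_k(resample(data_a, rng), resample(data_b, rng), stat, k)
        second = top_k(resample(data_a, rng), resample(data_b, rng), stat, k)
        overlaps.append(len(first & second) / k)
    return statistics.fmean(overlaps)


def select_statistic(data_a, data_b, candidates, k, n_boot=50, seed=0):
    """Pick the candidate whose top-k list is most reproducible."""
    scores = {
        name: reproducibility(data_a, data_b, fn, k, n_boot, seed)
        for name, fn in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Synthetic data: 200 genes, 5 samples per group, first 10 genes shifted.
    rng = random.Random(42)
    group_a = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(200)]
    group_b = [[rng.gauss(2 if g < 10 else 0, 1) for _ in range(5)] for g in range(200)]
    candidates = {"mean_diff": mean_diff, "t_like": t_like}
    print(select_statistic(group_a, group_b, candidates, k=10))
```

Each input is a list of genes, where each gene is a list of measurements for one condition. The sketch compares only two fixed candidates, whereas the abstract describes an optimization over statistics. It is meant to convey the selection principle, not to reproduce the published results.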
INDEX TERMS
Microarray, gene expression, gene ranking, reproducibility, differential expression, bootstrap
CITATION
Laura L. Elo, Sanna Filén, Riitta Lahesmaa, Tero Aittokallio, "Reproducibility-Optimized Test Statistic for Ranking Genes in Microarray Studies", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.5, no. 3, pp. 423-431, July-September 2008, doi:10.1109/tcbb.2007.1078