A General Framework for Analyzing Data from Two Short Time-Series Microarray Experiments
January-February 2011 (vol. 8 no. 1)
pp. 14-26
Mohak Shah, McGill University, Montreal
Jacques Corbeil, Laval University, Quebec
We propose a general theoretical framework for analyzing differentially expressed genes and behavior patterns in two homogeneous short time-course data sets. The framework generalizes the recently proposed Hilbert-Schmidt Independence Criterion (HSIC)-based framework, adapting it to the time-series scenario by using tensor analysis for data transformation. It yields criteria that identify both the differentially expressed genes and the time-course patterns of interest between two time-series experiments, without requiring the data to be clustered explicitly. The results obtained by applying the framework with a linear kernel formulation to various data sets are both biologically meaningful and consistent with published studies.
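
For readers unfamiliar with HSIC, the following is a minimal sketch of the standard biased empirical estimator, HSIC = tr(KHLH) / (m - 1)^2, computed with linear kernels. It illustrates only the base dependence criterion, not the tensor-based time-series extension proposed in the article. The function names, the toy expression values, and the two-class labels are hypothetical and chosen purely for illustration.

```python
import numpy as np

def linear_kernel(X):
    """Gram matrix of a linear kernel; rows of X are samples."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]  # treat a 1-D vector as m samples with one feature
    return X @ X.T

def hsic_biased(K, L):
    """Biased empirical HSIC: tr(K H L H) / (m - 1)^2, with H the centering matrix."""
    m = K.shape[0]
    H = np.eye(m) - np.ones((m, m)) / m
    return np.trace(K @ H @ L @ H) / (m - 1) ** 2

# Hypothetical toy data: 10 samples, two conditions encoded as +1 / -1.
labels = np.array([1.0] * 5 + [-1.0] * 5)

# A made-up "gene" whose expression tracks the condition ...
informative = np.array([2.1, 1.9, 2.0, 2.2, 1.8, -2.0, -1.9, -2.1, -1.8, -2.2])
# ... and one whose expression alternates regardless of the condition.
uninformative = np.array([1.0, -1.0] * 5)

L = linear_kernel(labels)
hsic_informative = hsic_biased(linear_kernel(informative), L)
hsic_uninformative = hsic_biased(linear_kernel(uninformative), L)

print(hsic_informative, hsic_uninformative)
```

Under these assumptions the condition-tracking feature receives a much larger HSIC score than the alternating one, which is the sense in which an HSIC-based criterion can rank genes by their dependence on the experimental condition.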

Index Terms:
Short time-series microarray data, HSIC, differentially expressed genes, gene behavior patterns.
Mohak Shah, Jacques Corbeil, "A General Framework for Analyzing Data from Two Short Time-Series Microarray Experiments," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 8, no. 1, pp. 14-26, Jan.-Feb. 2011, doi:10.1109/TCBB.2009.51