Issue No.02 - April-June (2009 vol.6)
pp: 200-212
Yongjin Park, Carnegie Mellon University, Pittsburgh
Stanley Shackney, Drexel University, Pittsburgh
Russell Schwartz, Carnegie Mellon University, Pittsburgh
ABSTRACT
Cancer cells exhibit a common phenotype of uncontrolled cell growth, but this phenotype may arise from many different combinations of mutations. By inferring how cells evolve in individual tumors, a process called cancer progression, we may be able to identify important mutational events for different tumor types, potentially leading to new therapeutics and diagnostics. Prior work has shown that it is possible to infer frequent progression pathways by using gene expression profiles to estimate “distances” between tumors. Here, we apply gene network models to improve these estimates of evolutionary distance by controlling for correlations among coregulated genes. We test three variants of this approach: one using an optimized best-fit network, another using sampling to infer a high-confidence subnetwork, and one using a modular network inferred from clusters of similarly expressed genes. Application to lung cancer and breast cancer microarray data sets shows small improvements in phylogenies when correcting from the optimized network and more substantial improvements when correcting from the sampled or modular networks. Our results suggest that a network correction approach improves estimates of tumor similarity, but sophisticated network models are needed to control for the large hypothesis space and sparse data currently available.
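As a rough illustration of why correlated genes can distort distance estimates, the toy sketch below compares a plain Euclidean distance between expression profiles with one computed after averaging genes within modules. It is not the authors' method: the gene names, tumor profiles, module assignments, and both function names are hypothetical, and module averaging is only a simple stand-in for the network-based corrections described in the abstract.

```python
import math


def plain_distance(profile_a, profile_b):
    """Euclidean distance over all genes, treating each gene as independent."""
    return math.sqrt(sum((profile_a[g] - profile_b[g]) ** 2 for g in profile_a))


def module_distance(profile_a, profile_b, modules):
    """Euclidean distance after averaging expression within each module,
    so that a group of coregulated genes contributes only one term."""
    total = 0.0
    for genes in modules:
        mean_a = sum(profile_a[g] for g in genes) / len(genes)
        mean_b = sum(profile_b[g] for g in genes) / len(genes)
        total += (mean_a - mean_b) ** 2
    return math.sqrt(total)


# Hypothetical toy data: g1, g2, g3 are assumed to be coregulated.
tumor_x = {"g1": 2.0, "g2": 2.0, "g3": 2.0, "g4": 0.0}
tumor_y = {"g1": 0.0, "g2": 0.0, "g3": 0.0, "g4": 0.0}
tumor_z = {"g1": 0.0, "g2": 0.0, "g3": 0.0, "g4": 3.0}
modules = [["g1", "g2", "g3"], ["g4"]]

# Uncorrected, the single shift in the coregulated group is counted three times.
print(plain_distance(tumor_x, tumor_y), plain_distance(tumor_y, tumor_z))
# After collapsing the module, tumor_x is closer to tumor_y than tumor_z is.
print(module_distance(tumor_x, tumor_y, modules),
      module_distance(tumor_y, tumor_z, modules))
```

In this toy case the uncorrected distance ranks tumor_z as nearer to tumor_y than tumor_x is, while the module-averaged distance reverses that ordering, because one regulatory change affecting three genes is no longer triple-counted.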
INDEX TERMS
Biology and genetics, graphs and networks, trees, machine learning.
CITATION
Yongjin Park, Stanley Shackney, Russell Schwartz, "Network-Based Inference of Cancer Progression from Microarray Data", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.6, no. 2, pp. 200-212, April-June 2009, doi:10.1109/TCBB.2008.126