Network-Based Inference of Cancer Progression from Microarray Data
April-June 2009 (vol. 6 no. 2)
pp. 200-212
Yongjin Park, Carnegie Mellon University, Pittsburgh
Stanley Shackney, Drexel University, Pittsburgh
Russell Schwartz, Carnegie Mellon University, Pittsburgh
Cancer cells exhibit a common phenotype of uncontrolled cell growth, but this phenotype may arise from many different combinations of mutations. By inferring how cells evolve in individual tumors, a process called cancer progression, we may be able to identify important mutational events for different tumor types, potentially leading to new therapeutics and diagnostics. Prior work has shown that it is possible to infer frequent progression pathways by using gene expression profiles to estimate “distances” between tumors. Here, we apply gene network models to improve these estimates of evolutionary distance by controlling for correlations among coregulated genes. We test three variants of this approach: one using an optimized best-fit network, another using sampling to infer a high-confidence subnetwork, and one using a modular network inferred from clusters of similarly expressed genes. Application to lung cancer and breast cancer microarray data sets shows small improvements in phylogenies when correcting from the optimized network and more substantial improvements when correcting from the sampled or modular networks. Our results suggest that a network correction approach improves estimates of tumor similarity, but sophisticated network models are needed to control for the large hypothesis space and sparse data currently available.
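
The idea of correcting tumor-to-tumor distances for correlated genes can be illustrated with a small sketch. This is not the authors' implementation: it assumes a hypothetical network in which each gene has at most one parent, replaces each child gene's expression with its residual after a least-squares fit on the parent, and then takes a plain Euclidean distance. The names `fit_line`, `network_corrected`, `distance`, and the toy gene names are illustrative assumptions.

```python
from math import sqrt


def fit_line(x, y):
    """Least-squares slope and intercept for y ~ x (assumed linear dependence)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        # Parent is constant across tumors: nothing to regress on.
        return 0.0, mean_y
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def network_corrected(profiles, parent_of):
    """Replace each child gene by its residual given its (single) parent gene.

    profiles:  list of dicts mapping gene name -> expression, one per tumor.
    parent_of: dict mapping gene name -> parent gene name (hypothetical network).
    """
    corrected = [dict(p) for p in profiles]
    for gene, parent in parent_of.items():
        x = [p[parent] for p in profiles]
        y = [p[gene] for p in profiles]
        slope, intercept = fit_line(x, y)
        for row, px, py in zip(corrected, x, y):
            row[gene] = py - (slope * px + intercept)
    return corrected


def distance(a, b):
    """Euclidean distance between two expression profiles."""
    return sqrt(sum((a[g] - b[g]) ** 2 for g in a))


# Toy data: geneB is fully determined by geneA, so it carries no independent signal.
tumors = [
    {"geneA": 1.0, "geneB": 2.0},
    {"geneA": 2.0, "geneB": 4.0},
    {"geneA": 3.0, "geneB": 6.0},
]
network = {"geneB": "geneA"}  # assumed regulatory edge geneA -> geneB

raw = distance(tumors[0], tumors[2])
fixed_profiles = network_corrected(tumors, network)
fixed = distance(fixed_profiles[0], fixed_profiles[2])
```

In this toy case the raw distance counts the same change twice, once per gene, while the corrected distance reflects only the change in `geneA`. Real expression data are noisy and high-dimensional, which is why the abstract stresses the choice of network model.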

Index Terms:
Biology and genetics, graphs and networks, trees, machine learning.
Citation:
Yongjin Park, Stanley Shackney, Russell Schwartz, "Network-Based Inference of Cancer Progression from Microarray Data," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 6, no. 2, pp. 200-212, April-June 2009, doi:10.1109/TCBB.2008.126