A Statistical Change Point Model Approach for the Detection of DNA Copy Number Variations in Array CGH Data
October-December 2009 (vol. 6 no. 4)
pp. 529-541
Jie Chen, University of Missouri-Kansas City, Kansas City
Yu-Ping Wang, University of Missouri-Kansas City, Kansas City
Array comparative genomic hybridization (aCGH) is a high-resolution, high-throughput technique for screening copy number variations (CNVs) across the entire genome. Compared with conventional CGH, it significantly improves the identification of chromosomal abnormalities. However, because of the random noise inherent in the imaging and hybridization process, identifying statistically significant DNA copy number changes in aCGH data is challenging. We propose a novel approach that uses the mean and variance change point model (MVCM) to detect CNVs, or breakpoints, in aCGH data sets. We derive an approximate p-value for the test statistic and give an estimate of the locus of the DNA copy number change. We carry out simulation studies to evaluate the accuracy of the estimate and of the p-value formulation. The simulation results show that the approach is effective in identifying copy number changes. The approach is also tested on publicly available aCGH data sets: fibroblast cancer cell line data, breast tumor cell line data, and breast cancer cell line data. On these cell lines, our approach detects biologically verified changes that the circular binary segmentation (CBS) method does not identify, with higher sensitivity and specificity than CBS.
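The idea of testing for a single change in both mean and variance can be illustrated with a generic likelihood-ratio scan over a normal sequence. The sketch below is only an illustration under stated assumptions: it is not the authors' MVCM test statistic or their p-value approximation, and the function names, the `min_seg` parameter, and the simulated log-ratio values are all hypothetical.

```python
import math
import random


def mle_variance(xs):
    """Maximum-likelihood (divide-by-n) variance of a sequence."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


def scan_mean_variance_change(xs, min_seg=5):
    """Scan for one change in mean and variance of a normal sequence.

    For each split k, compares a single-segment normal fit with a
    two-segment fit (separate mean and variance on each side) using
    n*log(v0) - k*log(v1) - (n-k)*log(v2), and returns the split with
    the largest value. min_seg (an assumption of this sketch) keeps very
    short segments from producing tiny variances and spurious peaks.
    """
    n = len(xs)
    if n < 2 * min_seg:
        raise ValueError("sequence too short for the requested min_seg")
    v0 = mle_variance(xs)
    if v0 <= 0.0:
        return None, 0.0  # constant sequence: no evidence of a change
    null_term = n * math.log(v0)
    best_k, best_stat = None, 0.0
    for k in range(min_seg, n - min_seg + 1):
        v1 = mle_variance(xs[:k])
        v2 = mle_variance(xs[k:])
        if v1 <= 0.0 or v2 <= 0.0:
            continue  # degenerate segment; skipped in this sketch
        stat = null_term - k * math.log(v1) - (n - k) * math.log(v2)
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat


# Simulated log-ratios (hypothetical values): 60 clones at the baseline,
# then 40 clones with a shifted mean and a larger variance.
rng = random.Random(0)
log_ratios = [rng.gauss(0.0, 0.1) for _ in range(60)]
log_ratios += [rng.gauss(1.0, 0.3) for _ in range(40)]
k_hat, stat = scan_mean_variance_change(log_ratios)
print("estimated change locus:", k_hat, "statistic:", round(stat, 1))
```

The returned index is the number of observations before the estimated change. Deciding whether the maximum is statistically significant requires a null distribution or an approximation such as the one derived in the paper, which this sketch does not attempt.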

Index Terms:
Statistical hypothesis testing, aCGH microarray data, gene expression, DNA copy numbers, CNVs.
Jie Chen, Yu-Ping Wang, "A Statistical Change Point Model Approach for the Detection of DNA Copy Number Variations in Array CGH Data," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 6, no. 4, pp. 529-541, Oct.-Dec. 2009, doi:10.1109/TCBB.2008.129