IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 6, no. 4, October-December 2009
A Statistical Change Point Model Approach for the Detection of DNA Copy Number Variations in Array CGH Data
Jie Chen, University of Missouri-Kansas City, Kansas City
Yu-Ping Wang, University of Missouri-Kansas City, Kansas City
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2008.129
Array comparative genomic hybridization (aCGH) is a high-resolution, high-throughput technique for screening copy number variations (CNVs) across the entire genome. Compared with conventional CGH, this technique significantly improves the identification of chromosomal abnormalities. However, due to the random noise inherent in the imaging and hybridization process, identifying statistically significant DNA copy number changes in aCGH data is challenging. We propose a novel approach that uses the mean and variance change point model (MVCM) to detect CNVs or breakpoints in aCGH data sets. We derive an approximate p-value for the test statistic and also give an estimate of the locus of the DNA copy number change. We carry out simulation studies to evaluate the accuracy of the estimate and the p-value formulation. These simulation results show that the approach is effective in identifying copy number changes. The approach is also tested on fibroblast cancer cell line data, breast tumor cell line data, and breast cancer cell line aCGH data sets that are publicly available. On these cell lines, our approach detects biologically verified changes that the circular binary segmentation (CBS) method does not identify, with higher sensitivity and specificity than CBS.
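To make the idea of a mean and variance change point concrete, the following is a minimal sketch, not the authors' MVCM procedure. It assumes independent Gaussian log-ratios with at most one change, and scans every candidate locus with a generic likelihood-ratio statistic that lets both the mean and the variance differ on the two sides. The function name `mean_variance_change_point`, the `min_seg` parameter, and the simulated data are all hypothetical; the approximate p-value described in the abstract is not reproduced here.

```python
import numpy as np


def mean_variance_change_point(x, min_seg=2):
    """Scan for a single change in mean and variance (illustrative only).

    Returns (k, stat): k is the estimated number of probes before the
    change, stat is the maximized log-likelihood-ratio statistic.
    Assumes Gaussian noise and non-constant segments (variance > 0).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if n < 2 * min_seg:
        raise ValueError("sequence too short for the requested min_seg")

    # n * log of the MLE variance under "no change" (np.var uses ddof=0).
    null_term = n * np.log(np.var(x))

    best_k, best_stat = None, -np.inf
    for k in range(min_seg, n - min_seg + 1):
        left, right = x[:k], x[k:]
        # Each segment gets its own mean and variance under the alternative.
        stat = (null_term
                - k * np.log(np.var(left))
                - (n - k) * np.log(np.var(right)))
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat


# Simulated log-ratios (hypothetical): a gain starting after probe 60.
rng = np.random.default_rng(0)
signal = np.concatenate([rng.normal(0.0, 0.2, 60),
                         rng.normal(1.0, 0.2, 40)])
k_hat, lr_stat = mean_variance_change_point(signal)
print(k_hat, lr_stat)
```

A large statistic suggests a change near `k_hat`; deciding whether it is statistically significant requires a null distribution for the maximum, which is the role of the approximate p-value the authors derive.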
Statistical hypothesis testing, aCGH microarray data, gene expression, DNA copy numbers, CNVs.
Jie Chen, Yu-Ping Wang, "A Statistical Change Point Model Approach for the Detection of DNA Copy Number Variations in Array CGH Data", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.6, no. 4, pp. 529-541, October-December 2009, doi:10.1109/TCBB.2008.129