Issue No.05 - Sept.-Oct. (2013 vol.10)
pp: 1137-1149
Ming-Chi Tsai, CMU-Pitt PhD Program in Computational Biology, Pittsburgh
Guy Blelloch, Carnegie Mellon University, Pittsburgh
R. Ravi, Carnegie Mellon University, Pittsburgh
Russell Schwartz, Carnegie Mellon University, Pittsburgh
ABSTRACT
Detecting and quantifying the timing and the genetic contributions of parental populations to a hybrid population is an important but challenging problem in reconstructing evolutionary histories from genetic variation data. With the advent of high-throughput genotyping technologies, new methods suitable for large-scale data are especially needed. Furthermore, existing methods typically assume that the assignment of individuals to subpopulations is known, when that assignment is itself a difficult problem, often unresolved for real data. Here, we propose a novel method that combines prior work on inferring nonreticulate population structures with an MCMC scheme for sampling over admixture scenarios, both to identify population assignments and to learn divergence times and admixture proportions for those populations from genome-scale admixed genetic variation data. We validated our method using coalescent simulations and a collection of real bovine and human variation data. On simulated sequences, our method shows better accuracy and faster runtime than leading competing methods in estimating admixture fractions and divergence times. Analysis of the real data further shows our method to be effective at matching the best current knowledge about the relevant populations.
INDEX TERMS
Sociology, Statistics, Bioinformatics, Genomics, Computational modeling, computations on discrete structures, Biology and genetics, graphs and networks, information theory
CITATION
Ming-Chi Tsai, Guy Blelloch, R. Ravi, Russell Schwartz, "Coalescent-Based Method for Learning Parameters of Admixture Events from Large-Scale Genetic Variation Data", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 10, no. 5, pp. 1137-1149, Sept.-Oct. 2013, doi:10.1109/TCBB.2013.98