Exact Computation of Coalescent Likelihood for Panmictic and Subdivided Populations under the Infinite Sites Model
October-December 2010 (vol. 7 no. 4)
pp. 611-618
Yufeng Wu, University of Connecticut, Storrs
Coalescent likelihood is the probability of observing the given population sequences under the coalescent model. Computation of coalescent likelihood under the infinite sites model is a classic problem in coalescent theory. Existing methods are based on either importance sampling or Markov chain Monte Carlo and are inexact. In this paper, we develop a simple method that can compute the exact coalescent likelihood for many data sets of moderate size, including real biological data whose likelihood was previously thought to be difficult to compute exactly. Our method works for both panmictic and subdivided populations. Simulations demonstrate that the practical range of exact coalescent likelihood computation for panmictic populations is significantly larger than what was previously believed. We investigate the application of our method in estimating mutation rates by maximum likelihood. A main application of the exact method is comparing the accuracy of approximate methods. To demonstrate the usefulness of the exact method, we evaluate the accuracy of the program Genetree in computing the likelihood for subdivided populations.

Index Terms:
Population genetics, coalescent theory, algorithms, subdivided population.
Yufeng Wu, "Exact Computation of Coalescent Likelihood for Panmictic and Subdivided Populations under the Infinite Sites Model," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 7, no. 4, pp. 611-618, Oct.-Dec. 2010, doi:10.1109/TCBB.2010.2