Issue No.01 - January-February (2011 vol.8)
pp: 182-193
Yufeng Wu, University of Connecticut, Storrs
ABSTRACT
Large amounts of population-scale genetic variation data are being collected. One potentially important biological problem is to infer the genealogical history of a population from these data. Partly due to recombination, the genealogical history of a set of DNA sequences in a population usually cannot be represented by a single tree. Instead, the genealogy is better represented by a genealogical network, which is a compact representation of a set of correlated local genealogical trees, each for a short region of the genome and possibly with a different topology. Inference of genealogical history for a set of DNA sequences under recombination has many potential applications, including association mapping of complex diseases. In this paper, we present two new methods for reconstructing local tree topologies in the presence of recombination, which extend and improve on previous work. We first show that the "tree scan" method can be converted to a probabilistic inference method based on a hidden Markov model. We then focus on developing a novel local tree inference method called RENT that is both accurate and scalable to larger data. Through simulation, we demonstrate the usefulness of our methods by showing that the hidden-Markov-model-based method is comparable with the original method in terms of accuracy. We also show that RENT is competitive with other methods in terms of inference accuracy: its inference error rate is often lower, and it can handle large data.
INDEX TERMS
Population genetics, recombination, ancestral recombination graph, algorithm, hidden Markov model.
CITATION
Yufeng Wu, "New Methods for Inference of Local Tree Topologies with Recombinant SNP Sequences in Populations", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.8, no. 1, pp. 182-193, January-February 2011, doi:10.1109/TCBB.2009.27
REFERENCES
[1] V. Bafna and V. Bansal, "The Number of Recombination Events in a Sample History: Conflict Graph and Lower Bounds," IEEE/ACM Trans. Computational Biology and Bioinformatics, vol. 1, no. 2, pp. 78-90, Apr.-June 2004.
[2] V. Bafna and V. Bansal, "Inference about Recombination from Haplotype Data: Lower Bounds and Recombination Hotspots," J. Computational Biology, vol. 13, pp. 501-521, 2006.
[3] M. Bordewich and C. Semple, "On the Computational Complexity of the Rooted Subtree Prune and Regraft Distance," Annals of Combinatorics, vol. 8, pp. 409-423, 2004.
[4] Z. Ding, T. Mailund, and Y.S. Song, "Efficient Whole-Genome Association Mapping Using Local Phylogenies for Unphased Genotype Data," Bioinformatics, vol. 24, no. 19, pp. 2215-2221, 2008.
[5] P. Fearnhead and P. Donnelly, "Estimating Recombination Rates from Population Genetic Data," Genetics, vol. 159, pp. 1299-1318, 2001.
[6] J. Felsenstein, Inferring Phylogenies. Sinauer, 2004.
[7] R.C. Griffiths and P. Marjoram, "Ancestral Inference from Samples of DNA Sequences with Recombination," J. Computational Biology, vol. 3, pp. 479-502, 1996.
[8] D. Gusfield, Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge Univ. Press, 1997.
[9] D. Gusfield, "Optimal, Efficient Reconstruction of Root-Unknown Phylogenetic Networks with Constrained and Structured Recombination," J. Computer and System Sciences, vol. 70, pp. 381-398, 2005.
[10] D. Gusfield, S. Eddhu, and C. Langley, "Optimal, Efficient Reconstruction of Phylogenetic Networks with Constrained Recombination," J. Bioinformatics and Computational Biology, vol. 2, pp. 173-213, 2004.
[11] D. Gusfield, S. Eddhu, and C. Langley, "The Fine Structure of Galls in Phylogenetic Networks," INFORMS J. Computing, vol. 16, pp. 459-469, 2004.
[12] J. Hein, "Reconstructing Evolution of Sequences Subject to Recombination Using Parsimony," Math. Biosciences, vol. 98, pp. 185-200, 1990.
[13] J. Hein, "A Heuristic Method to Reconstruct the History of Sequences Subject to Recombination," J. Molecular Evolution, vol. 36, pp. 396-405, 1993.
[14] J. Hein, M. Schierup, and C. Wiuf, Gene Genealogies, Variation and Evolution: A Primer in Coalescent Theory. Oxford Univ. Press, 2005.
[15] D. Hinds, L. Stuve, G. Nilsen, E. Halperin, E. Eskin, D. Gallinger, K. Frazer, and D. Cox, "Whole-Genome Patterns of Common DNA Variation in Three Human Populations," Science, vol. 307, pp. 1072-1079, 2005.
[16] R. Hudson, "Properties of a Neutral Allele Model with Intragenic Recombination," Theoretical Population Biology, vol. 23, pp. 183-201, 1983.
[17] R. Hudson and N. Kaplan, "Statistical Properties of the Number of Recombination Events in the History of a Sample of DNA Sequences," Genetics, vol. 111, pp. 147-164, 1985.
[18] R. Hudson, "Generating Samples under the Wright-Fisher Neutral Model of Genetic Variation," Bioinformatics, vol. 18, no. 2, pp. 337-338, 2002.
[19] D. Husmeier and F. Wright, "Detection of Recombination in DNA Multiple Alignments with Hidden Markov Models," J. Computational Biology, vol. 7, pp. 407-421, 2001.
[20] Int'l HapMap Consortium, "A Haplotype Map of the Human Genome," Nature, vol. 437, pp. 1299-1320, 2005.
[21] Int'l HapMap Consortium, "A Second Generation Human Haplotype Map of over 3.1 Million SNPs," Nature, vol. 449, pp. 851-861, 2007.
[22] F. Larribe, S. Lessard, and N.J. Schork, "Gene Mapping via Ancestral Recombination Graph," Theoretical Population Biology, vol. 62, pp. 215-229, 2002.
[23] N. Li and M. Stephens, "Modeling Linkage Disequilibrium, and Identifying Recombination Hotspots Using SNP Data," Genetics, vol. 165, pp. 2213-2233, 2003.
[24] R. Lyngso, Y.S. Song, and J. Hein, "Minimum Recombination Histories by Branch and Bound," Proc. Workshop Algorithms in Bioinformatics (WABI), pp. 239-250, 2005.
[25] T. Mailund, S. Besenbacher, and M.H. Schierup, "Whole Genome Association Mapping by Incompatibilities and Local Perfect Phylogenies," BMC Bioinformatics, vol. 7, p. 454, 2006.
[26] G. McGuire, F. Wright, and M.J. Prentice, "A Bayesian Model for Detecting Past Recombination Events in DNA Multiple Alignments," J. Computational Biology, vol. 7, pp. 159-170, 2001.
[27] G.A.T. McVean and N.J. Cardin, "Approximating the Coalescent with Recombination," Philosophical Trans. Royal Soc., vol. 360, pp. 1387-1393, 2005.
[28] M. Minichiello and R. Durbin, "Mapping Trait Loci Using Inferred Ancestral Recombination Graphs," Am. J. Human Genetics, vol. 79, pp. 910-922, 2006.
[29] S.R. Myers and R.C. Griffiths, "Bounds on the Minimum Number of Recombination Events in a Sample History," Genetics, vol. 163, pp. 375-394, 2003.
[30] M. Nordborg and S. Tavare, "Linkage Disequilibrium: What History Has to Tell Us," Trends in Genetics, vol. 18, pp. 83-90, 2002.
[31] A. Siepel and D. Haussler, "Combining Phylogenetic and Hidden Markov Models in Biosequence Analysis," Proc. Seventh Ann. Conf. Research in Computational Molecular Biology (RECOMB '03), pp. 277-286, 2003.
[32] Y.S. Song and J. Hein, "On the Minimum Number of Recombination Events in the Evolutionary History of DNA Sequences," J. Math. Biology, vol. 48, pp. 160-186, 2003.
[33] Y.S. Song, "On the Combinatorics of Rooted Binary Phylogenetic Trees," Annals of Combinatorics, vol. 7, pp. 365-379, 2003.
[34] Y.S. Song, Y. Wu, and D. Gusfield, "Efficient Computation of Close Lower and Upper Bounds on the Minimum Number of Needed Recombinations in the Evolution of Biological Sequences," Bioinformatics, vol. 21, pp. i413-i422, 2005.
[35] Y.S. Song and J. Hein, "Constructing Minimal Ancestral Recombination Graphs," J. Computational Biology, vol. 12, pp. 159-178, 2005.
[36] Y.S. Song, Z. Ding, D. Gusfield, C.H. Langley, and Y. Wu, "Algorithms to Distinguish the Role of Gene-Conversion from Single-Crossover Recombination in the Derivation of SNP Sequences in Populations," Proc. 10th Ann. Int'l Conf. Research in Computational Molecular Biology (RECOMB), pp. 231-245, 2006.
[37] L. Wang, K. Zhang, and L. Zhang, "Perfect Phylogenetic Networks with Recombination," J. Computational Biology, vol. 8, pp. 69-78, 2001.
[38] Y. Wu and D. Gusfield, "Efficient Computation of Minimum Recombination over Genotypes (not Haplotypes)," Proc. Computational Systems Bioinformatics (CSB), pp. 145-156, 2006.
[39] Y. Wu, "Association Mapping of Complex Diseases with Ancestral Recombination Graphs: Models and Efficient Algorithms," Proc. 11th Ann. Int'l Conf. Research in Computational Molecular Biology (RECOMB '07), pp. 488-502, 2007.
[40] C. Wiuf and J. Hein, "Recombination as a Point Process Along Sequences," Theoretical Population Biology, vol. 55, pp. 1217-1228, 1999.
[41] S. Zollner and J.K. Pritchard, "Coalescent-Based Association Mapping and Fine Mapping of Complex Trait Loci," Genetics, vol. 169, pp. 1071-1092, 2005.