loading...
 This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
The Number of Recombination Events in a Sample History: Conflict Graph and Lower Bounds
April-June 2004 (vol. 1 no. 2)
pp. 78-90
We consider the following problem: Given a set of binary sequences, determine lower bounds on the minimum number of recombinations required to explain the history of the sample, under the infinite-sites model of mutation. The problem has implications for finding recombination hotspots and for the Ancestral Recombination Graph reconstruction problem [29]. Hudson and Kaplan [15] gave a lower bound based on the four-gamete test. In practice, their bound R_m often greatly underestimates the minimum number of recombinations. The problem was recently revisited by Myers and Griffiths [22], who introduced two new lower bounds R_h and R_s which are provably better, and also yield good bounds in practice. However, the worst-case complexities of their procedures for computing R_h and R_s are exponential and super-exponential, respectively. In this paper, we show that the number of nontrivial connected components, R_c, in the conflict graph [4] for a given set of sequences, computable in time O(nm^2), is also a lower bound on the minimum number of recombination events. We show that in many cases, R_c is a better bound than R_h. The conflict graph was used by Gusfield et al. [4] to obtain a polynomial time algorithm for the galled tree problem, which is a special case of the Ancestral Recombination Graph (ARG) reconstruction problem. Our results also offer some insight into the structural properties of this graph and are of interest for the general Ancestral Recombination Graph reconstruction problem.

[1] 78 A.G. Clark, K.M. Weiss, D.A. Nickerson, S.L. Taylor, A. Buchanan, J. Stengard, V. Salomaa, E. Vartiainen, M. Perola, E. Boerwinkle, and C.F. Sing, “Haplotype Structure and Population Genetic Inferences from Nucleotide-Sequence Variation in Human Lipoprotein Lipase,” Am. J. Human Genetics, vol. 63, pp. 595-612, 1998.[2] D.C. Crawford, T. Bhangale, N. Li, G. Hellenthal, M.J. Rieder, D.A. Nickerson, and M. Stephens, “Evidence for Substantial Fine-Scale Variation in Recombination Rates across the Human Genome,” Nature Genetics, vol. 36, pp. 700-706, 2004.[3] M.J. Daly, J.D. Rioux, S.F. Schaffner, T.J. Hudson, and E.S. Lander, “High-Resolution Haplotype Structure in the Human Genome,” Nature Genetics, vol. 29, pp. 229-232, 2001.[4] D. Gusfield, S. Eddhu, and C. Langley, “Efficient Reconstruction of Phylogenetic Networks with Constrained Recombination,” Proc. IEEE Computer Soc. Bioinformatics Conf., pp. 363-374, 2003.[5] P. Fearnhead and P. Donnelly, “Estimating Recombination Rates from Population Genetic Data,” Genetics, vol. 159, pp. 1299-1318, 2001.[6] P. Fearnhead, R.M. Harding, J.A. Schneider, S. Myers, and P. Donnelly, “Application of Coalescent Methods to Reveal Fine-Scale Rate Variation and Recombination Hotspots,” Genetics, vol. 167, pp. 2067-2081, 2004.[7] S.B. Gabriel, S.F. Schaffner, H. Nguyen, J.M. Moore, J. Roy, B. Blumenstiel, J. Higgins, M. DeFelice, A. Lochner, M. Faggart, S.N. Liu-Cordero, C. Rotimi, A. Adeyemo, R. Cooper, R. Ward, E.S. Lander, M.J. Daly, and D. Altschuler, “The Structure of Haplotype Blocks in the Human Genome,” Science, vol. 296, no. 5576, pp. 2225-2229, 2002.[8] R.C. Griffiths and P. Marjoram, “Ancestral Inference from Samples of DNA Sequences with Recombination,” J. Computational Biology, vol. 3, no. 4, pp. 479-502, 1996.[9] D. Gusfield, “Efficient Algorithms for Inferring Evolutionary Trees,” Networks, vol. 21, pp. 19-28, 1991.[10] D. Gusfield, “Optimal, Efficient Reconstruction of Root-Unknown Phylogenetic Networks with Constrained and Structured Recombination,” technical report, Univ. of California at Davis, 2004.[11] D. Gusfield and D. Hickerson, “A Fundamental, Efficiently-Computed Lower Bound on the Number of Recombinations Needed in Phylogenetic Networks,” technical report, Univ. of California at Davis, 2004.[12] The Int'l HapMap Consortium, The Int'l HapMap Project, Nature, vol. 426, pp. 789-796, http:/www.hapmap.org/, 2003.[13] J. Hein, “A Heuristic Method to Reconstruct the History of Sequences Subject to Recombination,” J. Molecular Evolution, vol. 20, pp. 402-411, 1993.[14] R.R. Hudson, “Two-Locus Sampling Distributions and Their Applications,” Genetics, vol. 159, pp. 1805-1817, 2001.[15] R.R. Hudson and N.L. Kaplan, “Statistical Properties of the Number of Recombination Events in the History of a Sample of DNA Sequences,” Genetics, vol. 111, pp. 147-164, 1985.[16] Int'l Human Genome Sequencing Consortium, “Initial Sequencing and Analysis of the Human Genome,” Nature, vol. 409, pp. 860-921, 2001.[17] A.J. Jeffreys, A. Ritchie, and R. Neumann, “High Resolution Analysis of Haplotype Diversity and Meiotic Crossover in the Human Tap2 Recombination Hotspot,” Human Molecular Genetics, vol. 9, pp. 725-733, 2000.[18] M. Kreitman, “Nucleotide Polymorphism at the Alcohol Dehydrogenase Locus of Drosophila Melanogaster,” Genetics, vol. 11, pp. 147-164, 1985.[19] N. Li and M. Stephens, “Modeling Linkage Disequilibrium and Identifying Recombination Hotspots Using Single-Nucleotide Polymorphism Data,” Genetics, vol. 165, pp. 2213-2233, 2003.[20] G.A. McVean, T.P. Awadella, and P. Fearnhead, “A Coalescent Method for Detecting Recombination from Gene Sequences,” Genetics, vol. 160, pp. 1231-1241, 2002.[21] G.A. McVean, S.R. Myers, S. Hunt, P Deloukas, D.R. Bentley, and P. Donnelly, “The Fine-Scale Structure of Recombination Rate Variation in the Human Genome,” Science, vol. 304, pp. 581-584, 2004.[22] S.R. Myers and R.C. Griffiths, “Bounds on the Minimum Number of Recombination Events in a Sample History,” Genetics, vol. 163, pp. 375-394, 2003.[23] D.A. Nickerson, S.L. Taylor, S.M. Fullerton, K.M. Weiss, A.G. Clark, J.H. Stengaard, V. Salomaa, E. Boerwinkle, and C.F. Sing, “Sequence Diversity and Large-Scale Typing of SNPS in the Human Apolipoprotein E Gene,” Genome Research, vol. 10, pp. 1532-1545, 2000.[24] T.D. Petes, “Meiotic Recombination Hot Spots and Cold Spots,” Nature Rev. Genetics, vol. 1, pp. 360-369, 2001.[25] “Genomics of Cardiovascular Development, Adaptation, and Remodeling,” NHLBI Program for Genomic Applications, Harvard Medical School, http:/cardiogenomics.org. Sept. 2004.[26] Y.S. Song and J. Hein, “Parsimonious Reconstruction of Sequence Evolution and Haplotype Blocks: Finding the Minimum Number of Recombination Events,” Proc. Conf. Algorithms in Bioinformatics, WABI 2003, pp. 287-302, 2003.[27] Y.S. Song and J. Hein, “On the Minimum Number of Recombination Events in the Evolutionary History of DNA Sequences,” J. Math. Biology, vol. 48, pp. 160-186, 2004.[28] G. Venter et al., “The Sequence of the Human Genome,” Science, vol. 291, pp. 1304-1351, 2001.[29] L. Wang, K. Zhang, and L. Zhang, “Perfect Phylogenetic Networks with Recombination,” J. Computational Biology, vol. 8, no. 1, pp. 69-78, 2001.

Index Terms:
Recombination, phylogenetic networks, ancestral recombination graph, haplotypes, lower bounds, conflict graph, NP-completeness.
Citation:
Vineet Bafna, Vikas Bansal, "The Number of Recombination Events in a Sample History: Conflict Graph and Lower Bounds," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 1, no. 2, pp. 78-90, Apr.-June 2004, doi:10.1109/TCBB.2004.23
Usage of this product signifies your acceptance of the Terms of Use.