Topological Rearrangements and Local Search Method for Tandem Duplication Trees
January-March 2005 (vol. 2 no. 1)
pp. 15-28
The problem of reconstructing the duplication history of a set of tandemly repeated sequences was first introduced by Fitch [4]. Many recent studies address this problem: they demonstrate the validity of the unequal recombination model proposed by Fitch, describe numerous inference algorithms, and explore the combinatorial properties of the new mathematical objects involved, namely duplication trees. In this paper, we deal with the topological rearrangement of these trees. Classical rearrangements used in phylogeny (NNI, SPR, TBR, ...) cannot be applied directly to duplication trees. We show that restricting the neighborhood defined by the SPR (Subtree Pruning and Regrafting) rearrangement to valid duplication trees still allows the whole duplication tree space to be explored. We use these restricted rearrangements in a local search method that improves an initial tree via successive rearrangements. This method is applied to the optimization of the parsimony and minimum evolution criteria. We show through simulations that this method improves on all existing programs, both in reconstructing the topology of the true tree and in recovering its duplication events. We apply this approach to tandemly repeated human zinc finger genes and observe that our method yields a much better duplication tree than any other program.
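
The local search idea summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes rooted, unordered binary trees stored as nested tuples, uses plain (unrestricted) SPR moves, and scores trees with Fitch's small-parsimony count [32]. The names `local_search`, `spr_neighbors`, and `is_valid` are hypothetical, and `is_valid` is only a placeholder for the duplication-tree validity test, which the abstract does not specify and which is not reproduced here.

```python
from typing import Callable, Dict, Iterator, Set, Tuple, Union

# A tree is a leaf label (str) or a pair of subtrees (assumed representation).
Tree = Union[str, Tuple["Tree", "Tree"]]

def fitch_site(tree: Tree, states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Fitch's bottom-up pass for one site: (candidate states, change count)."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_site(tree[0], states)
    right_set, right_cost = fitch_site(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1

def parsimony(tree: Tree, seqs: Dict[str, str]) -> int:
    """Sum the per-site Fitch counts over aligned, equal-length sequences."""
    length = len(next(iter(seqs.values())))
    return sum(
        fitch_site(tree, {name: s[i] for name, s in seqs.items()})[1]
        for i in range(length)
    )

def subtrees(tree: Tree) -> Iterator[Tree]:
    """Yield the tree itself and every subtree below it."""
    yield tree
    if isinstance(tree, tuple):
        yield from subtrees(tree[0])
        yield from subtrees(tree[1])

def prune(tree: Tree, sub: Tree) -> Tree:
    """Remove `sub`; its sibling takes the place of their parent."""
    if isinstance(tree, str):
        return tree
    left, right = tree
    if left == sub:
        return right
    if right == sub:
        return left
    return (prune(left, sub), prune(right, sub))

def regraft(tree: Tree, sub: Tree) -> Iterator[Tree]:
    """Attach `sub` on every edge of `tree`, including the edge above the root."""
    yield (tree, sub)
    if isinstance(tree, tuple):
        left, right = tree
        for t in regraft(left, sub):
            yield (t, right)
        for t in regraft(right, sub):
            yield (left, t)

def spr_neighbors(tree: Tree) -> Iterator[Tree]:
    """Unrestricted SPR neighborhood (leaf labels are assumed to be distinct)."""
    for sub in subtrees(tree):
        if sub == tree:
            continue  # the whole tree cannot be pruned
        yield from regraft(prune(tree, sub), sub)

def local_search(
    tree: Tree,
    score: Callable[[Tree], int],
    neighbors: Callable[[Tree], Iterator[Tree]],
    is_valid: Callable[[Tree], bool] = lambda t: True,  # placeholder validity test
) -> Tuple[Tree, int]:
    """First-improvement hill climbing over valid neighbors (lower score is better)."""
    best, best_score = tree, score(tree)
    improved = True
    while improved:
        improved = False
        for cand in neighbors(best):
            if not is_valid(cand):
                continue
            s = score(cand)
            if s < best_score:
                best, best_score, improved = cand, s, True
                break  # restart from the improved tree
    return best, best_score

# Toy data (invented for illustration only).
seqs = {"a": "AAAA", "b": "AAAT", "c": "TTTT", "d": "TTTA"}
start: Tree = (("a", "c"), ("b", "d"))
best, best_score = local_search(start, lambda t: parsimony(t, seqs), spr_neighbors)
print(parsimony(start, seqs), best_score)
```

To restrict the search in the spirit of the paper, a caller would pass an `is_valid` predicate that accepts only duplication trees; the search loop itself would not change. A minimum evolution criterion could likewise be swapped in through the `score` argument.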

[1] F. Blattner, G. Plunkett, C. Bloch, N. Perna, V. Burland, M. Riley, J. Collado-Vides, J. Glasner, C. Rode, G. Mayhew, J. Gregor, N. Davis, H. Kirkpatrick, M. Goeden, D. Rose, B. Mau, and Y. Shao, “The Complete Genome Sequence of Escherichia Coli K-12,” Science, vol. 277, no. 5331, pp. 1453-1474, 1997.
[2] E. Lander et al., “Initial Sequencing and Analysis of the Human Genome,” Nature, vol. 409, pp. 860-921, 2001.
[3] A. Smit, “Interspersed Repeats and Other Mementos of Transposable Elements in Mammalian Genomes,” Current Opinion in Genetics & Development, vol. 9, pp. 657-663, 1999.
[4] W. Fitch, “Phylogenies Constrained by Cross-Over Process as Illustrated by Human Hemoglobins in a Thirteen-Cycle, Eleven Amino-Acid Repeat in Human Apolipoprotein A-I,” Genetics, vol. 86, pp. 623-644, 1977.
[5] G. Levinson and G. Gutman, “Slipped-Strand Mispairing: A Major Mechanism for DNA Sequence Evolution,” Molecular Biology and Evolution, vol. 4, pp. 203-221, 1987.
[6] J. Zhang and M. Nei, “Evolution of Antennapedia-Class Homeobox Genes,” Genetics, vol. 142, no. 1, pp. 295-303, 1996.
[7] O. Elemento and O. Gascuel, “An Exact and Polynomial Distance-Based Algorithm to Reconstruct Single Copy Tandem Duplication Trees,” Proc. 14th Ann. Symp. Combinatorial Pattern Matching (CPM2003), 2003.
[8] O. Elemento, O. Gascuel, and M.-P. Lefranc, “Reconstructing the Duplication History of Tandemly Repeated Genes,” Molecular Biology and Evolution, vol. 19, pp. 278-288, 2002.
[9] G. Benson and L. Dong, “Reconstructing the Duplication History of a Tandem Repeat,” Proc. Intelligent Systems in Molecular Biology (ISMB1999), T. Lengauer, ed., pp. 44-53, 1999.
[10] M. Tang, M. Waterman, and S. Yooseph, “Zinc Finger Gene Clusters and Tandem Gene Duplication,” J. Computational Biology, vol. 9, pp. 429-446, 2002.
[11] E. Rivals, “A Survey on Algorithmic Aspects of Tandem Repeats Evolution,” Int'l J. Foundations of Computer Science, vol. 15, no. 2, pp. 225-257, 2004.
[12] O. Gascuel, D. Bertrand, and O. Elemento, “Reconstructing the Duplication History of Tandemly Repeated Sequences,” Math. of Evolution and Phylogeny, O. Gascuel, ed., 2004.
[13] S. Ohno, Evolution by Gene Duplication. Springer Verlag, 1970.
[14] P.L. Fleche, Y. Hauck, L. Onteniente, A. Prieur, F. Denoeud, V. Ramisse, P. Sylvestre, G. Benson, F. Ramisse, and G. Vergnaud, “A Tandem Repeats Database for Bacterial Genomes: Application to the Genotyping of Yersinia Pestis and Bacillus Anthracis,” BioMed Central Microbiology, vol. 1, pp. 2-15, 2001.
[15] D. Jaitly, P. Kearney, G. Lin, and B. Ma, “Methods for Reconstructing the History of Tandem Repeats and Their Application to the Human Genome,” J. Computer and System Sciences, vol. 65, pp. 494-507, 2002.
[16] P. Sneath and R. Sokal, Numerical Taxonomy. pp. 230-234, San Francisco: W.H. Freeman and Company, 1973.
[17] N. Saitou and M. Nei, “The Neighbor-Joining Method: A New Method for Reconstructing Phylogenetic Trees,” Molecular Biology and Evolution, vol. 4, pp. 406-425, 1987.
[18] O. Elemento and O. Gascuel, “A Fast and Accurate Distance-Based Algorithm to Reconstruct Tandem Duplication Trees,” Bioinformatics, vol. 18, pp. 92-99, 2002.
[19] J. Barthélemy and A. Guénoche, Trees and Proximity Representations. Wiley and Sons, 1991.
[20] S. Sattath and A. Tversky, “Additive Similarity Trees,” Psychometrika, vol. 42, pp. 319-345, 1977.
[21] L. Zhang, B. Ma, L. Wang, and Y. Xu, “Greedy Method for Inferring Tandem Duplication History,” Bioinformatics, vol. 19, pp. 1497-1504, 2003.
[22] D. Swofford, P. Olsen, P. Waddell, and D. Hillis, Molecular Systematics. pp. 407-514, Sunderland, Mass.: Sinauer Associates, 1996.
[23] D. Swofford, PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods), version 4. Sunderland, Mass.: Sinauer Associates, 1999.
[24] J. Felsenstein, “PHYLIP - PHYLogeny Inference Package,” Cladistics, vol. 5, pp. 164-166, 1989.
[25] C. Semple and M. Steel, Phylogenetics. Oxford Univ. Press, 2003.
[26] O. Gascuel, M. Hendy, A. Jean-Marie, and S. McLachlan, “The Combinatorics of Tandem Duplication Trees,” Systematic Biology, vol. 52, pp. 110-118, 2003.
[27] J. Yang and L. Zhang, “On Counting Tandem Duplication Trees,” Molecular Biology and Evolution, vol. 21, pp. 1160-1163, 2004.
[28] D. Robinson, “Comparison of Labeled Trees with Valency Three,” J. Combinatorial Theory, vol. 11, pp. 105-119, 1971.
[29] L. Wang and D. Gusfield, “Improved Approximation Algorithms for Tree Alignment,” J. Algorithms, vol. 25, pp. 255-273, 1997.
[30] Y. Pauplin, “Direct Calculation of a Tree Length Using a Distance Matrix,” J. Molecular Evolution, vol. 51, pp. 41-47, 2000.
[31] R. Desper and O. Gascuel, “Theoretical Foundation of the Balanced Minimum Evolution Method of Phylogenetic Inference and Its Relationship to Weighted Least-Squares Tree Fitting,” Molecular Biology and Evolution, vol. 21, no. 3, pp. 587-598, 2004.
[32] W. Fitch, “Toward Defining the Course of Evolution: Minimum Change for a Specified Tree Topology,” Systematic Zoology, vol. 20, pp. 406-416, 1971.
[33] J. Hartigan, “Minimum Mutation Fits to a Given Tree,” Biometrics, vol. 29, pp. 53-65, 1973.
[34] G. Ganapathy, V. Ramachandran, and T. Warnow, “Better Hill-Climbing Searches for Parsimony,” Proc. Third Int'l Workshop Algorithms in Bioinformatics, 2003.
[35] P.A. Goloboff, “Methods for Faster Parsimony Analysis,” Cladistics, vol. 12, pp. 199-220, 1996.
[36] V. Berry and O. Gascuel, “Inferring Evolutionary Trees with Strong Combinatorial Evidence,” Theoretical Computer Science, vol. 240, pp. 271-298, 2000.
[37] M. Kimura, “A Simple Model for Estimating Evolutionary Rates of Base Substitutions through Comparative Studies of Nucleotide Sequences,” J. Molecular Evolution, vol. 16, pp. 111-120, 1980.
[38] D. Jones, W. Taylor, and J. Thornton, “The Rapid Generation of Mutation Data Matrices from Protein Sequences,” Computer Applications in Biosciences, vol. 8, pp. 275-282, 1992.
[39] K. Kidd and L. Sgaramella-Zonta, “Phylogenetic Analysis: Concepts and Methods,” Am. J. Human Genetics, vol. 23, pp. 235-252, 1971.
[40] A. Rzhetsky and M. Nei, “Theoretical Foundation of the Minimum-Evolution Method of Phylogenetic Inference,” Molecular Biology and Evolution, vol. 10, pp. 1073-1095, 1993.
[41] W. Day, “Computational Complexity of Inferring Phylogenies from Dissimilarity Matrices,” Bull. Math. Biology, vol. 49, pp. 461-467, 1987.
[42] C. Semple and M. Steel, “Cyclic Permutations and Evolutionary Trees,” Advances in Applied Math., vol. 32, no. 4, pp. 669-680, 2004.
[43] R. Desper and O. Gascuel, “Fast and Accurate Phylogeny Reconstruction Algorithms Based on the Minimum-Evolution Principle,” J. Computational Biology, vol. 9, pp. 687-706, 2002.
[44] M. Kuhner and J. Felsenstein, “A Simulation Comparison of Phylogeny Algorithms under Equal and Unequal Evolutionary Rates,” Molecular Biology and Evolution, vol. 11, pp. 459-468, 1994.
[45] A. Rambaut and N. Grassly, “Seq-Gen: An Application for the Monte Carlo Simulation of DNA Sequence Evolution Along Phylogenetic Trees,” Computer Applications in Biosciences, vol. 13, pp. 235-238, 1997.
[46] J. Felsenstein and G. Churchill, “A Hidden Markov Model Approach to Variation Among Sites in Rate of Evolution,” Molecular Biology and Evolution, vol. 13, pp. 93-104, 1996.
[47] P.A. Goloboff, J.S. Farris, and K. Nixon, “TNT: Tree Analysis Using New Technology,” 2000, www.cladistics.com.
[48] T. El-Baradi and T. Pieler, “Zinc Finger Proteins: What We Know and What We Would Like to Know,” Mechanisms of Development, vol. 33, pp. 155-169, 1991.
[49] M. Shannon, J. Kim, L. Ashworth, E. Branscomb, and L. Stubbs, “Tandem Zinc-Finger Gene Families in Mammals: Insights and Unanswered Questions,” DNA Sequence - The J. Sequencing and Mapping, vol. 8, no. 5, pp. 303-315, 1998.
[50] P. Waddell and M. Steel, “General Time Reversible Distances with Unequal Rates Across Sites: Mixing T and Inverse Gaussian Distributions with Invariant Sites,” Molecular Phylogenetics and Evolution, vol. 8, pp. 398-414, 1997.
[51] K.C. Nixon, “The Parsimony Ratchet, a New Method for Rapid Parsimony Analysis,” Cladistics, vol. 15, pp. 407-414, 1999.
[52] S. Guindon and O. Gascuel, “A Simple, Fast and Accurate Method to Estimate Large Phylogenies by Maximum-Likelihood,” Systematic Biology, vol. 52, no. 5, pp. 696-704, 2003.
[53] J. Felsenstein, “Cases in Which Parsimony or Compatibility Methods Will Be Positively Misleading,” Systematic Zoology, vol. 27, pp. 401-410, 1978.
[54] D. Page and M. Charleston, “From Gene to Organismal Phylogeny: Reconciled Trees and the Gene Tree/Species Tree Problem,” Molecular Phylogenetics and Evolution, vol. 7, pp. 231-240, 1997.
[55] M. Hallett, J. Lagergren, and A. Tofigh, “Simultaneous Identification of Duplications and Lateral Transfers,” Proc. Conf. Research and Computational Molecular Biology (RECOMB2004), pp. 347-356, 2004.

Index Terms:
Tandem duplication trees, phylogeny, topological rearrangements, local search, parsimony, minimum evolution, Zinc finger genes.
Citation:
Denis Bertrand, Olivier Gascuel, "Topological Rearrangements and Local Search Method for Tandem Duplication Trees," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 2, no. 1, pp. 15-28, Jan.-March 2005, doi:10.1109/TCBB.2005.15