The Community for Technology Leaders
Subscribe
Issue No.04 - October-December (2009 vol.6)
pp: 552-569
Gabriel Cardona , University of the Balearic Islands, Palma de Mallorca
Francesc Rosselló , University of the Balearic Islands, Palma de Mallorca
Gabriel Valiente , Technical University of Catalonia, Barcelona
ABSTRACT
Phylogenetic networks are a generalization of phylogenetic trees that allow for the representation of nontreelike evolutionary events, like recombination, hybridization, or lateral gene transfer. While much progress has been made to find practical algorithms for reconstructing a phylogenetic network from a set of sequences, all attempts to endorse a class of phylogenetic networks (strictly extending the class of phylogenetic trees) with a well-founded distance measure have, to the best of our knowledge and with the only exception of the bipartition distance on regular networks, failed so far. In this paper, we present and study a new meaningful class of phylogenetic networks, called tree-child phylogenetic networks, and we provide an injective representation of these networks as multisets of vectors of natural numbers, their path multiplicity vectors. We then use this representation to define a distance on this class that extends the well-known Robinson-Foulds distance for phylogenetic trees and to give an alignment method for pairs of networks in this class. Simple polynomial algorithms for reconstructing a tree-child phylogenetic network from its path multiplicity vectors, for computing the distance between two tree-child phylogenetic networks and for aligning a pair of tree-child phylogenetic networks, are provided. They have been implemented as a Perl package and a Java applet, which can be found at http://bioinfo.uib.es/~recerca/phylonetworks/mudistance/.
INDEX TERMS
Phylogenetic network, tree-child phylogenetic network, phylogenetic tree, partition distance, network alignment.
CITATION
Gabriel Cardona, Francesc Rosselló, Gabriel Valiente, "Comparison of Tree-Child Phylogenetic Networks", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.6, no. 4, pp. 552-569, October-December 2009, doi:10.1109/TCBB.2007.70270
REFERENCES
 [1] B.L. Allen and M.A. Steel, “Subtree Transfer Operations and Their Induced Metrics on Evolutionary Trees,” Annals of Combinatorics, vol. 5, no. 1, pp. 1-13, 2001. [2] V. Bafna and V. Bansal, “The Number of Recombination Events in a Sample History: Conflict Graph and Lower Bounds,” IEEE/ACM Trans. Computational Biology and Bioinformatics, vol. 1, no. 2, pp. 78-90, Apr.-June 2004. [3] H.-J. Bandelt, “Phylogenetic Networks,” Verhandl. Naturwiss. Vereins Hamburg, vol. 34, pp. 51-71, 1994. [4] M. Baroni, C. Semple, and M. Steel, “A Framework for Representing Reticulate Evolution,” Annals of Combinatorics, vol. 8, no. 4, pp. 391-408, 2004. [5] M. Baroni, C. Semple, and M. Steel, “Hybrids in Real Time,” Systematic Biology, vol. 55, no. 1, pp. 46-56, 2006. [6] J. Bluis and D.-G. Shin, “Nodal Distance Algorithm: Calculating a Phylogenetic Tree Comparison Metric,” Proc. Third IEEE Symp. BioInformatics and BioEng. (BIBE '03), pp. 87-94, 2003. [7] M. Bordewich and C. Semple, “Computing the Minimum Number of Hybridization Events for a Consistent Evolutionary History,” Discrete Applied Math., vol. 155, no. 8, pp. 914-928, 2007. [8] G. Cardona, F. Rosselló, and G. Valiente, “A Perl Package and an Alignment Tool for Phylogenetic Networks,” BMC Boinformatics, vol. 9, p. 175, 2008. [9] G. Cardona, F. Rosselló, and G. Valiente, “Tripartitions Do Not Always Discriminate Phylogenetic Networks,” Math. Biosciences, vol. 211, no. 2, pp. 356-370, 2008. [10] D.E. Critchlow, D.K. Pearl, and C. Qian, “The Triples Distance for Rooted Bifurcating Phylogenetic Trees,” Systematic Biology, vol. 45, no. 3, pp. 323-334, 1996. [11] B. DasGupta, X. He, T. Jiang, M. Li, J. Tromp, L. Wang, and L. Zhang, “Computing Distances between Evolutionary Trees,” Handbook of Combinatorial Optimization, D.-Z. Du and P. Pardalos, eds., pp. 35-76, Kluwer Academic Publishers, 1998. [12] D. Gusfield, S. Eddhu, and C. Langley, “The Fine Structure of Galls in Phylogenetic Networks,” INFORMS J. Computing, vol. 16, no. 4, pp. 459-469, 2004. [13] D. Gusfield, S. Eddhu, and C. Langley, “Optimal, Efficient Reconstruction of Phylogenetic Networks with Constrained Recombination,” J. Bioinformatics and Computational Biology, vol. 2, no. 1, pp. 173-213, 2004. [14] D.H. Huson, “Tutorial: Introduction to Phylogenetic Networks,” Proc. German Conf. Bioinformatics (GCB), tutorial, 2006. [15] D.H. Huson, “Split Networks and Reticulate Networks,” Reconstructing Evolution: New Math. and Computational Advances, O. Gascuel and M. Steel, eds., pp. 247-276, Oxford Univ. Press, 2007. [16] D.H. Huson and D. Bryant, “Application of Phylogenetic Networks in Evolutionary Studies,” Molecular Biology and Evolution, vol. 23, no. 2, pp. 254-267, 2006. [17] G. Jin, L. Nakhleh, S. Snir, and T. Tuller, “Maximum Likelihood of Phylogenetic Networks,” Bioinformatics, vol. 22, no. 21, pp. 2604-2611, 2006. [18] G. Jin, L. Nakhleh, S. Snir, and T. Tuller, “Efficient Parsimony-Based Methods for Phylogenetic Network Reconstruction,” Bioinformatics, vol. 23, no. 2, pp. 123-128, 2007. [19] H.W. Kuhn, “The Hungarian Method for the Assignment Problem,” Naval Research Logistics Quarterly, vol. 2, pp. 83-97, 1955. [20] C.R. Linder, B.M.E. Moret, L. Nakhleh, A. Padolina, J. Sun, A. Tholse, R. Timme, and T. Warnow, “An Error Metric for Phylogenetic Networks,” Technical Report TR03-26, Univ. of New Mexico, 2003. [21] C.R. Linder, B.M.E. Moret, L. Nakhleh, and T. Warnow, “Network (Reticulate) Evolution: Biology, Models, and Algorithms,” Proc. Ninth Pacific Symp. Biocomputing (PSB), tutorial, 2003. [22] W.P. Maddison, “Gene Trees in Species Trees,” Systematic Biology, vol. 46, no. 3, pp. 523-536, 1997. [23] B.M.E. Moret, “Computational Challenges from the Tree of Life,” Proc. Seventh Workshop Algorithm Eng. and Experiments and Second Workshop Analytic Algorithmics and Combinatorics, pp. 3-16, 2005. [24] B.M.E. Moret, L. Nakhleh, and T. Warnow, “An Error Metric for Phylogenetic Networks,” Technical Report TR02-09, Univ. of New Mexico, 2002. [25] B.M.E. Moret, L. Nakhleh, T. Warnow, C.R. Linder, A. Tholse, A. Padolina, J. Sun, and R. Timme, “Phylogenetic Networks: Modeling, Reconstructibility, and Accuracy,” IEEE/ACM Trans. Computational Biology and Bioinformatics, vol. 1, no. 1, pp. 13-23, Jan.-Mar. 2004. [26] J. Munkres, “Algorithms for the Assignment and Transportation Problems,” J. SIAM, vol. 5, no. 1, pp. 32-38, 1957. [27] T. Munzner, F. Guimbretière, S. Tasiran, L. Zhang, and Y. Zhou, “TreeJuxtaposer: Scalable Tree Comparison Using Focus$+$ Context with Guaranteed Visibility,” ACM Trans. Graphics, vol. 22, no. 3, pp. 453-462, 2003. [28] S.R. Myers and R.C. Griffiths, “Bounds on the Minimum Number of Recombination Events in a Sample History,” Genetics, vol. 163, no. 1, pp. 375-394, 2003. [29] L. Nakhleh, “Phylogenetic Networks,” PhD dissertation, Univ. of Texas, Austin, 2004. [30] L. Nakhleh, A. Clement, T. Warnow, C.R. Linder, and B.M.E. Moret, “Quality Measures for Phylogenetic Networks,” Technical Report TR04-06, Univ. of New Mexico, 2004. [31] L. Nakhleh, J. Sun, T. Warnow, C.R. Linder, B.M.E. Moret, and A. Tholse, “Towards the Development of Computational Tools for Evaluating Phylogenetic Network Reconstruction Methods,” Proc. Eighth Pacific Symp. Biocomputing (PSB '03), pp. 315-326, 2003. [32] L. Nakhleh, T. Warnow, C.R. Linder, and K.S. John, “Reconstructing Reticulate Evolution in Species: Theory and Practice,” J.Computational Biology, vol. 12, no. 6, pp. 796-811, 2005. [33] T.M. Nye, P. Lio, and W.R. Gilks, “A Novel Algorithm and Web-Based Tool for Comparing Two Alternative Phylogenetic Trees,” Bioinformatics, vol. 22, no. 1, pp. 117-119, 2006. [34] R.D.M. Page, “Parallel Phylogenies: Reconstructing the History of Host-Parasite Assemblages,” Cladistics, vol. 10, no. 2, pp. 155-173, 1995. [35] R.D.M. Page, “Phyloinformatics: Toward a Phylogenetic Database,” Data Mining in Bioinformatics, J.T.-L. Wang, M.J. Zaki, H.Toivonen, and D. Shasha, eds., pp. 219-241, Springer, 2005. [36] R.D.M. Page and G. Valiente, “An Edit Script for Taxonomic Classifications,” BMC Bioinformatics, vol. 6, p. 208, 2005. [37] P. Puigbò, S. Garcia-Vallvé, and J.O. McInerney, “TOPD/FMTS: A New Software to Compare Phylogenetic Trees,” Bioinformatics, vol. 23, no. 12, pp. 1556-1558, 2007. [38] D.F. Robinson and L.R. Foulds, “Comparison of Phylogenetic Trees,” Math. Biosciences, vol. 53, no. 1/2, pp. 131-147, 1981. [39] C. Semple, “Hybridization Networks,” Reconstructing Evolution: New Mathematical and Computational Advances, O. Gascuel and M.Steel, eds., Oxford Univ. Press, 2007. [40] Y.S. Song, J. Hein, “Constructing Minimal Ancestral Recombination Graphs,” J. Computational Biology, vol. 12, no. 2, pp. 147-169, 2005. [41] K. Strimmer and V. Moulton, “Likelihood Analysis of Phylogenetic Networks Using Directed Graphical Models,” Molecular Biology and Evolution, vol. 17, no. 6, pp. 875-881, 2000. [42] K. Strimmer, C. Wiuf, and V. Moulton, “Recombination Analysis Using Directed Graphical Models,” Molecular Biology and Evolution, vol. 18, no. 1, pp. 97-99, 2001. [43] L. Wang, K. Zhang, and L. Zhang, “Perfect Phylogenetic Networks with Recombination,” J. Computational Biology, vol. 8, no. 1, pp. 69-78, 2001. [44] M.S. Waterman and T.F. Smith, “On the Similarity of Dendograms,” J. Theoretical Biology, vol. 73, no. 4, pp. 789-800, 1978. [45] S.J. Willson, “Unique Solvability of Certain Hybrid Networks from Their Distances,” Annals of Combinatorics, vol. 10, no. 1, pp.165-178, 2006. [46] S.J. Willson, “Reconstruction of Some Hybrid Phylogenetic Networks with Homoplasies from Distances,” Bull. of Math. Biology, vol. 69, no. 8, pp. 2561-2590, 2007. [47] S.J. Willson, “Restrictions on Meaningful Phylogenetic Networks,” Proc. Workshop Current Challenges and Problems in Phylogenetics, accepted contributed talk, Sept. 2007. [48] S.J. Willson, “Unique Determination of Some Homoplasies at Hybridization Events,” Bull. of Math. Biology, vol. 69, no. 5, pp.1709-1725, 2007.