IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2011, vol. 8, Issue No. 03 - May/June

pp: 635-649

Katharina T. Huber, University of East Anglia, Norwich

Leo van Iersel, University of Canterbury, Christchurch

Steven Kelk, Centrum voor Wiskunde en Informatica (CWI), Amsterdam

Radosław Suchecki, University of East Anglia, Norwich

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2010.17

ABSTRACT

Recently, much attention has been devoted to the construction of phylogenetic networks, which generalize phylogenetic trees in order to accommodate complex evolutionary processes. Here, we present an efficient, practical algorithm for reconstructing level-1 phylogenetic networks (a type of network slightly more general than a phylogenetic tree) from triplets. Our algorithm has been made publicly available as the program Lev1athan. It combines ideas from several known theoretical algorithms for phylogenetic tree and network reconstruction with two novel subroutines, namely an exponential-time exact algorithm and a greedy algorithm, both of which are of independent theoretical interest. Most importantly, Lev1athan runs in polynomial time and always constructs a level-1 network. If the data are consistent with a phylogenetic tree, then the algorithm constructs such a tree. Moreover, if the input triplet set is dense and, in addition, fully consistent with some level-1 network, then Lev1athan will find such a network. The potential of Lev1athan is explored by means of an extensive simulation study and a biological data set. One of our conclusions is that Lev1athan is able to construct networks consistent with a high percentage of input triplets, even when these input triplets are affected by a low to moderate level of noise.
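To make the notions of triplet consistency and density used in the abstract concrete, the following is a minimal sketch in Python. It is not the Lev1athan implementation. It only illustrates the tree case, using the classical BUILD procedure of Aho et al. for constructing a tree from rooted triplets. The representation (a triplet ab|c stored as the tuple `(a, b, c)`, trees as nested tuples) and all function names are assumptions made for this illustration.

```python
from itertools import combinations

# Assumed encoding: the rooted triplet ab|c is the tuple (a, b, c), meaning that
# a and b share a more recent common ancestor than either does with c.
# Trees are nested tuples; a leaf is any non-tuple label such as "a".

def build(leaves, triplets):
    """BUILD (Aho et al.): return a tree consistent with all triplets, or None."""
    leaves = set(leaves)
    if len(leaves) == 1:
        return next(iter(leaves))
    parent = {x: x for x in leaves}  # union-find over the current leaf set

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Only triplets whose three leaves all lie in the current leaf set matter.
    relevant = [t for t in triplets if set(t) <= leaves]
    for a, b, _ in relevant:
        parent[find(a)] = find(b)  # a and b must end up below the same child
    components = {}
    for x in leaves:
        components.setdefault(find(x), set()).add(x)
    if len(components) == 1:
        return None  # no valid split at the root: no tree displays all triplets
    subtrees = []
    for comp in sorted(components.values(), key=min):
        sub = build(comp, relevant)
        if sub is None:
            return None
        subtrees.append(sub)
    return tuple(subtrees)

def leaf_set(tree):
    """Set of leaf labels below a node."""
    if not isinstance(tree, tuple):
        return {tree}
    return set().union(*(leaf_set(child) for child in tree))

def displays(tree, triplet):
    """True if the tree is consistent with the triplet ab|c."""
    a, b, c = triplet
    if not isinstance(tree, tuple):
        return False
    for child in tree:
        s = leaf_set(child)
        if {a, b, c} <= s:
            return displays(child, triplet)  # all three leaves sit deeper down
        if {a, b} <= s:
            return c in leaf_set(tree) - s  # a, b together; c under another child
    return False

def is_dense(leaves, triplets):
    """Dense: every 3-subset of the leaves is covered by at least one triplet."""
    covered = {frozenset(t) for t in triplets}
    return all(frozenset(s) in covered for s in combinations(sorted(leaves), 3))

def consistency(tree, triplets):
    """Fraction of the given triplets that the tree is consistent with."""
    return sum(displays(tree, t) for t in triplets) / len(triplets)

if __name__ == "__main__":
    triplets = [("a", "b", "c"), ("a", "b", "d"), ("c", "d", "a"), ("c", "d", "b")]
    tree = build(["a", "b", "c", "d"], triplets)
    print(tree, is_dense("abcd", triplets), consistency(tree, triplets))
    # Conflicting triplets on the same three leaves admit no tree at all.
    print(build(["a", "b", "c"], [("a", "b", "c"), ("a", "c", "b")]))
```

When `build` returns `None`, no tree is consistent with every input triplet. That is the situation in which a network, or a tree consistent with only part of the input, has to be considered instead. The `consistency` function corresponds to the kind of percentage the abstract reports, here computed for trees only.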

INDEX TERMS

Phylogenetic networks, level-1, triplets, polynomial time.

CITATION

Katharina T. Huber, Leo van Iersel, Steven Kelk, Radosław Suchecki, "A Practical Algorithm for Reconstructing Level-1 Phylogenetic Networks," *IEEE/ACM Transactions on Computational Biology and Bioinformatics*, vol. 8, no. 3, pp. 635-649, May/June 2011, doi:10.1109/TCBB.2010.17