The Community for Technology Leaders
Subscribe
Issue No.01 - January-March (2009 vol.6)
pp: 62-75
Stephen J. Willson , Iowa State University, Ames
ABSTRACT
Given a collection of rooted phylogenetic trees with overlapping sets of leaves, a compatible supertree $S$ is a single tree whose set of leaves is the union of the input sets of leaves and such that $S$ agrees with each input tree when restricted to the leaves of the input tree. Typically with trees from real data, no compatible supertree exists, and various methods may be utilized to reconcile the incompatibilities in the input trees. This paper focuses on a measure of robustness of a supertree method called its radius" $R$. The larger the value of $R$ is, the further the data set can be from a natural correct tree $T$ and yet the method will still output $T$. It is shown that the maximal possible radius for a method is $R = 1/2$. Many familiar methods, both for supertrees and consensus trees, are shown to have $R = 0$, indicating that they need not output a tree $T$ that would seem to be the natural correct answer. A polynomial-time method Normalized Triplet Supertree (NTS) with the maximal possible $R = 1/2$ is defined. A geometric interpretion is given, and NTS is shown to solve an optimization problem. Additional properties of NTS are described.
INDEX TERMS
Trees, Graph algorithms, phylogenetic tree, supertree
CITATION
Stephen J. Willson, "Robustness of Topological Supertree Methods for Reconciling Dense Incompatible Data", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.6, no. 1, pp. 62-75, January-March 2009, doi:10.1109/TCBB.2008.51
REFERENCES
 [1] E.N. AdamsIII, “N-Trees as Nestings: Complexity, Similarity and Consensus,” J. Classification, vol. 3, pp. 299-317, 1986. [2] A.V. Aho, Y. Sagiv, T.G. Szymanski, and J.D. Ullman, “Inferring a Tree from Lowest Common Ancestors with an Application to the Optimization of Relational Expressions,” SIAM J. Computing, vol. 10, no. 3, pp. 405-421, 1981. [3] K. Atteson, “The Performance of Neighbor-Joining Methods of Phylogenetic Reconstruction,” Algorithmica, vol. 25, pp. 251-278, 1999. [4] J.-P. Barthelemy and A. Guenoche, Trees and Proximity Representations. John Wiley & Sons, 1991. [5] B.R. Baum, “Combining Trees as a Way of Combining Data Sets for Phylogenetic Inference, and the Desirability of Combining Gene Trees,” Taxon, vol. 41, pp. 3-10, 1992. [6] V. Berry and O. Gascuel, “Inferring Evolutionary Trees with Strong Combinatorial Evidence,” Theoretical Computer Science, vol. 240, pp. 271-298, 2000. [7] V. Berry, T. Jiang, P. Kearney, M. Li, and T. Wareham, “Quartet Cleaning: Improved Algorithms and Simulations,” Proc. Seventh Ann. European Symp. Algorithms (ESA '99), J. Nesetril, ed., 1999. [8] O. Bininda-Emonds and M.J. Sanderson, “Assessment of the Accuracy of Matrix Representation with Parsimony Analysis Supertree Construction,” Systematic Biology, vol. 50, no. 4, pp.565-579, 2001. [9] O. Bininda-Emonds, J.L. Gittleman, and M. Steel, “The (Super)Tree of Life: Procedures, Problems, and Prospects,” Ann. Rev. Ecology and Systematics, vol. 33, pp. 265-289, 2002. [10] Phylogenetic Supertrees: Combining Information to Reveal the Tree of Life, O.R.P. Bininda-Emonds, ed. Kluwer Academic Publishers, 2004. [11] O.R.P. Bininda-Emonds, M. Cardillo, K.E. Jones, R.D.D. MacPhee, R.M.D. Beck, R. Grenyer, S.A. Price, R.A. Vos, J.L. Gittleman, and A. Purvis, “The Delayed Rise of Present-Day Mammals,” Nature, vol. 446, pp. 507-512, 2007. [12] C.J. Creevey, D.A. Fitzpatrick, G.K. Philip, R.J. Kinsella, M.J. O'Connell, M.M. Pentony, S.A. Travers, M. Wilkinson, and J.O. McInerney, “Does a Tree-Like Phylogeny Only Exist at the Tips in the Prokaryotes?” Proc. Royal Soc. of London B. Biological Sciences, vol. 271, pp. 2551-2558, 2004. [13] C.J. Creevey and J.O. McInerney, “CLANN: Software for Supertree Analysis,” Bioinformatics, vol. 21, pp. 390-392, 2005. [14] J.H. Degnan and N.A. Rosenberg, “Discordance of Species Trees with Their Most Likely Gene Trees,” PLoS Genetics, vol. 2, no. 5, 2006, e68, DOI: 10.1371/journal.pgen.0020068. [15] P.L. Erdös, M.A. Steel, L.A. Székely, and T.J. Warnow, “Constructing Big Trees from Short Sequences,” Proc. 24th Int'l Colloquium in Automata, Languages and Programming (ICALP '97), P. Degano, R. Gorrieri, and A. Marchetti-Spaccamela, eds., pp.827-837, 1997. [16] P.L. Erdös, M.A. Steel, L.A. Székely, and T.J. Warnow, “A Few Logs Suffice to Build (Almost) All Trees (I),” Random Structures and Algorithms, vol. 14, pp. 153-184, 1999. [17] J. Felsenstein, Inferring Phylogenies. Sinauer Assoc., Inc., 2004. [18] L.R. Foulds and R.L. Graham, “The Steiner Problem in PhylogenyIs NP-Complete,” Advances in Applied Math., vol. 3, pp. 43-49, 1982. [19] M. Gondran, “La Structure Algébrique des Classifications Hiérarchiques,” Annales de l'INSEE, vols. 22-23, pp. 181-190, 1976. [20] M. Gondran and M. Mindux, Graphs and Algorithms (translated by S. Vajda). John Wiley & Sons, 1984. [21] T. Jiang, P.E. Kearney, and M. Li, “Orchestrating Quartets: Approximation and Data Corrections,” Proc. 39th IEEE Symp. Foundations of Computer Science (FOCS '98), pp. 416-425, 1998. [22] R.D.M. Page, “Modified Mincut Supertrees,” Proc. Second Int'l Workshop Algorithms in Bioinformatics (WABI '02), R. Guig and D.Gusfield, eds., pp. 537-552, Sept. 2002. [23] G.K. Philip, C.J. Creevey, and J.O. McInerney, “The Opisthokonta and the Ecdysozoa May Not Be Clades: Stronger Support for the Grouping of Plant and Animal than for Animal and Fungi and Stronger Support for the Coelomata than Ecdysozoa,” Molecular Biology and Evolution, vol. 22, no. 5, pp. 1175-1184, 2005. [24] M.A. Ragan, “Phylogenetic Inference Based on Matrix Representation of Trees,” Molecular Phylogenetics and Evolution, vol. 1, pp.53-58, 1992. [25] V. Ranwez, V. Berry, A. Criscuolo, P.-H. Fabre, S. Guillemot, C. Scornavacca, and E. Douzery, “PhySIC: A Veto Supertree Method with Desirable Properties,” Systematic Biology, vol. 56, no. 5, pp. 798-817, 2007. [26] M.J. Sanderson, A. Purvis, and C. Henze, “Phylogenetic Supertrees: Assembling the Trees of Life,” Trends in Ecology and Evolution, vol. 13, pp. 105-109, 1998. [27] C. Semple and M. Steel, “A Supertree Method for Rooted Trees,” Discrete Applied Math., vol. 105, pp. 147-158, 2000. [28] C. Semple and M. Steel, Phylogenetics. Oxford Univ. Press, 2003. [29] S. Snir and S. Rao, “Using Max Cut to Enhance Rooted Trees Consistency,” IEEE/ACM Trans. Computational Biology and Bioinformatics, vol. 3, no. 4, pp. 323-333, Oct.-Dec. 2006. [30] M. Steel, S. Böcker, and A.W.M. Dress, “Simple but Fundamental Limitations on Supertree and Consensus Tree Methods,” Systematic Biology, vol. 49, no. 2, pp. 363-368, 2000. [31] D.L. Swofford, PAUP*, Phylogenetic Analysis Using Parsimony (*and Other Methods), Version 4. Sinauer Assoc., 2002. [32] M. Wilkinson, J.L. Thorley, D. Pisani, F.-J. Lapointe, and J.O. McInerney, “Some Desiderata for Liberal Supertrees,” Phylogenetic Supertrees: Combining Information to Reveal the Tree of Life, O.R.P. Bininda-Emonds, ed., pp. 227-246, Kluwer Academic Publishers, 2004.