Fast Local Search for Unrooted Robinson-Foulds Supertrees
July-Aug. 2012 (vol. 9 no. 4)
pp. 1004-1013
J. G. Burleigh, Dept. of Biol., Univ. of Florida, Gainesville, FL, USA
R. Chaudhary, Dept. of Comput. Sci., Iowa State Univ., Ames, IA, USA
D. Fernandez-Baca, Dept. of Comput. Sci., Iowa State Univ., Ames, IA, USA
A Robinson-Foulds (RF) supertree for a collection of input trees is a tree containing all the species in the input trees that is at minimum total RF distance to the input trees. Thus, an RF supertree is consistent with the maximum number of splits in the input trees. Constructing RF supertrees for rooted and unrooted data is NP-hard. Nevertheless, effective local search heuristics have been developed for the restricted case where the input trees and the supertree are rooted. We describe new heuristics, based on the Edge Contract and Refine (ECR) operation, that remove this restriction, thereby expanding the utility of RF supertrees. Our experimental results on simulated and empirical data sets show that our unrooted local search algorithms yield better supertrees than those obtained from MRP and rooted RF heuristics in terms of total RF distance to the input trees and, for simulated data, in terms of RF distance to the true tree.
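
The objective described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and says nothing about the ECR-based search itself; it only scores a candidate supertree. It assumes trees are written as nested tuples of taxon names, that the RF distance is the unnormalized size of the symmetric difference between the two trees' sets of nontrivial splits (some authors halve this value), and that the supertree is restricted to each input tree's taxa before comparison. The function names (leaves, splits, rf_distance, total_rf) are hypothetical.

```python
def leaves(tree):
    """Return the set of taxon names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree, taxa):
    """Nontrivial splits (bipartitions) of `taxa` induced by `tree`.

    Taxa outside `taxa` are ignored, which restricts the tree to `taxa`.
    Each split is stored as an unordered pair of taxon sets, so the
    position of the root in the nested-tuple encoding does not matter.
    """
    taxa = frozenset(taxa)
    found = set()

    def visit(node):
        if isinstance(node, str):
            side = frozenset([node]) & taxa
        else:
            side = frozenset().union(*(visit(child) for child in node))
        other = taxa - side
        # A split is nontrivial when both sides hold at least two taxa.
        if len(side) >= 2 and len(other) >= 2:
            found.add(frozenset([side, other]))
        return side

    visit(tree)
    return found


def rf_distance(tree_a, tree_b, taxa):
    """Assumed definition: number of splits found in exactly one tree."""
    return len(splits(tree_a, taxa) ^ splits(tree_b, taxa))


def total_rf(supertree, input_trees):
    """Sum of RF distances from the restricted supertree to each input tree."""
    return sum(rf_distance(supertree, t, leaves(t)) for t in input_trees)


if __name__ == "__main__":
    supertree = ((("A", "B"), "C"), ("D", "E"))
    inputs = [
        (("A", "B"), ("C", "D")),  # agrees with the supertree on A,B,C,D
        (("A", "C"), ("B", "E")),  # conflicts with the supertree on A,B,C,E
    ]
    print(total_rf(supertree, inputs))
```

Under these assumptions, a local search heuristic would repeatedly modify the candidate supertree and keep changes that lower the value returned by total_rf.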

Index Terms:
trees (mathematics), computational complexity, genetics, optimisation, tree searching, unrooted local search algorithms, fast local search, unrooted Robinson-Foulds supertrees, NP-hard unrooted data, local search heuristics, edge contract and refine operation, empirical data sets, Radio frequency, Vegetation, Phylogeny, Search problems, Materials requirements planning, Bioinformatics, Computational biology, NNI, Computational phylogenetics, Robinson-Foulds, supertrees, local search, 2-ECR
J. G. Burleigh, R. Chaudhary, D. Fernandez-Baca, "Fast Local Search for Unrooted Robinson-Foulds Supertrees," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 4, pp. 1004-1013, July-Aug. 2012, doi:10.1109/TCBB.2012.47