The Kernel of Maximum Agreement Subtrees
July-Aug. 2012 (vol. 9 no. 4)
pp. 1023-1031
N. D. Pattengale, Sandia Nat. Labs., Albuquerque, NM, USA
E. Chen, Dept. of Biol., Univ. of Ottawa, Ottawa, ON, Canada
K. M. Swenson, Dept. of Math. & Stat., Univ. of Ottawa, Montreal, QC, Canada
D. Sankoff, Dept. of Math. & Stat., Univ. of Ottawa, Ottawa, ON, Canada
A Maximum Agreement SubTree (MAST) is a largest subtree common to a set of trees and serves as a summary of common substructure in the trees. A single MAST can be misleading, however, since there can be an exponential number of MASTs, and two MASTs for the same tree set do not even necessarily share any leaves. In this paper, we introduce the notion of the Kernel Agreement SubTree (KAST), which is the summary of the common substructure in all MASTs, and show that it can be calculated in polynomial time (for trees with bounded degree). Suppose the input trees represent competing hypotheses for a particular phylogeny. We explore the utility of the KAST as a method to discern the common structure of confidence, and as a measure of how confident we are in a given tree set. We also show the trend of the KAST, as compared to other consensus methods, on the set of all trees visited during a Bayesian analysis of flatworm genomes.
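
The point that several MASTs can coexist and share few leaves can be illustrated with a minimal brute-force sketch. It is not the polynomial-time algorithm the paper describes: it enumerates every leaf subset, so it is exponential and suitable only for toy inputs. It assumes rooted trees written as nested tuples of leaf names, and it assumes, as a simplification, that the kernel can be read as the leaves shared by every MAST. The function names (`restrict`, `all_masts`, `kernel_leaves`) are hypothetical and chosen for this illustration.

```python
from itertools import combinations

def leaves(tree):
    """Return the set of leaf labels of a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result

def restrict(tree, keep):
    """Restrict a rooted tree to the leaves in `keep`.

    Nodes left with a single child are suppressed. The result is an
    order-independent form (nested frozensets), or None if no leaf survives.
    """
    if isinstance(tree, str):
        return tree if tree in keep else None
    kids = [r for r in (restrict(c, keep) for c in tree) if r is not None]
    if not kids:
        return None
    if len(kids) == 1:
        return kids[0]
    return frozenset(kids)

def all_masts(trees):
    """Leaf sets of all maximum agreement subtrees (brute force)."""
    common = sorted(set.intersection(*(leaves(t) for t in trees)))
    for size in range(len(common), 0, -1):
        found = [set(s) for s in combinations(common, size)
                 if len({restrict(t, set(s)) for t in trees}) == 1]
        if found:
            return found  # largest size at which the trees agree
    return []

def kernel_leaves(trees):
    """Leaves present in every MAST (assumed reading of the kernel)."""
    masts = all_masts(trees)
    return set.intersection(*masts) if masts else set()

# Two toy hypotheses that disagree on the placement of b and c.
t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "c"), "b"), "d")
masts = all_masts([t1, t2])
kernel = kernel_leaves([t1, t2])
```

For these two toy trees there are three MASTs of three leaves each, and only leaf `d` appears in all of them, so reporting any single MAST would overstate the agreement on `a`, `b`, or `c`.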

Index Terms:
zoology, Bayes methods, genetics, genomics, trees (mathematics), flatworm genomes, maximum agreement subtrees, MAST, tree substructure, leaves, Kernel Agreement SubTree, KAST, polynomial time, phylogeny, Bayesian analysis, Vegetation, Kernel, Bioinformatics, Computational biology, Heuristic algorithms, Phylogenetics, consensus tree, agreement subtree
Citation:
N. D. Pattengale, E. Chen, K. M. Swenson, D. Sankoff, "The Kernel of Maximum Agreement Subtrees," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 4, pp. 1023-1031, July-Aug. 2012, doi:10.1109/TCBB.2012.11