An Efficient Method for Exploring the Space of Gene Tree/Species Tree Reconciliations in a Probabilistic Framework
January/February 2012 (vol. 9 no. 1)
pp. 26-39
Jean-Philippe Doyon, Université Montpellier II
Sylvie Hamel, University of Montreal, Montreal
Cedric Chauve, Simon Fraser University, Burnaby
Background. Inferring an evolutionary scenario for a gene family is a fundamental problem with applications in both functional and evolutionary genomics. The gene tree/species tree reconciliation approach has been widely used to address this problem, but mostly in a discrete parsimony framework that aims to minimize the number of gene duplications and/or gene losses. Recently, a probabilistic approach based on the classical birth-and-death process has been developed, together with efficient algorithms for computing posterior probabilities of reconciliations and for orthology prediction.
Results. In previous work, we described an algorithm for exploring the whole space of gene tree/species tree reconciliations, which we adapt here to efficiently compute the posterior probability of such reconciliations. These posterior probabilities can be either computed exactly or approximated, depending on the size of the reconciliation space. We use this algorithm to analyze the probabilistic landscape of the space of reconciliations for a real data set of fungal gene families and for several data sets of synthetic gene trees.
Conclusion. The results of our simulations suggest that, with exact gene trees obtained by a simple birth-and-death process and realistic gene duplication/loss rates, only a very small subset of all reconciliations needs to be explored in order to approximate very closely the posterior probability of the most likely reconciliations. For cases where the posterior probability mass is more evenly dispersed, our method makes it possible to explore the required subspace of reconciliations efficiently.

Index Terms:
Comparative genomics, species tree, gene tree, probability, reconciliation, parsimony.
Jean-Philippe Doyon, Sylvie Hamel, Cedric Chauve, "An Efficient Method for Exploring the Space of Gene Tree/Species Tree Reconciliations in a Probabilistic Framework," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 1, pp. 26-39, Jan.-Feb. 2012, doi:10.1109/TCBB.2011.64