Issue No.06 - November/December (2011 vol.8)
pp: 1685-1691
Louxin Zhang, National University of Singapore, Singapore
ABSTRACT
When gene copies are sampled from various species, the resulting gene tree might disagree with the containing species tree. The primary causes of gene tree and species tree discord include incomplete lineage sorting, horizontal gene transfer, and gene duplication and loss. Each of these events yields a different parsimony criterion for inferring the (containing) species tree from gene trees. With incomplete lineage sorting, species tree inference seeks the tree that minimizes the extra gene lineages that had to coexist along species lineages; with gene duplication, it seeks the tree that minimizes gene duplications and/or losses. In this paper, we present the following results: 1) The deep coalescence cost is equal to the number of gene losses minus two times the gene duplication cost in the reconciliation of a uniquely leaf-labeled gene tree and a species tree. The deep coalescence cost can be computed in linear time for an arbitrary gene tree and species tree. 2) The deep coalescence cost is never less than the gene duplication cost in the reconciliation of an arbitrary gene tree and a species tree. 3) Species tree inference by minimizing deep coalescence events is NP-hard.
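Result 1 can be checked on small trees. The following is a minimal Python sketch, not the paper's algorithm: it assumes rooted binary trees written as nested tuples with string leaves, a gene tree whose leaf labels are exactly the species (each used once), and the usual LCA reconciliation with the standard loss-counting convention. The helper names (`index_species`, `lca`, `reconcile`) are hypothetical, and the LCA lookup climbs parent pointers, so this version is quadratic in the worst case rather than linear.

```python
def index_species(tree, parent, depth, d=0):
    """Record parent and depth of every species node, keyed by its leaf set."""
    if isinstance(tree, str):
        node = frozenset([tree])
    else:
        kids = [index_species(child, parent, depth, d + 1) for child in tree]
        node = frozenset().union(*kids)
        for kid in kids:
            parent[kid] = node
    depth[node] = d
    return node


def lca(a, b, parent, depth):
    """Least common ancestor of two species nodes, found by climbing parents."""
    while a != b:
        if depth[a] >= depth[b]:
            a = parent[a]
        else:
            b = parent[b]
    return a


def reconcile(gene, species):
    """Return (duplications, losses, deep coalescence cost) under LCA mapping.

    Assumes the gene tree is uniquely leaf labeled over all species, so every
    species edge carries at least one gene lineage.
    """
    parent, depth = {}, {}
    index_species(species, parent, depth)
    dup = loss = lineages = 0

    def image(g):
        # Returns the species node that gene node g is mapped to.
        nonlocal dup, loss, lineages
        if isinstance(g, str):
            return frozenset([g])
        left, right = (image(child) for child in g)
        m = lca(left, right, parent, depth)
        is_dup = m in (left, right)  # g and a child share the same image
        dup += is_dup
        for child in (left, right):
            edges = depth[child] - depth[m]  # species edges this gene edge crosses
            lineages += edges
            if child != m:
                loss += edges if is_dup else edges - 1
        return m

    image(gene)
    # Extra lineages: total lineages over all species edges minus one per edge.
    deep = lineages - len(parent)
    return dup, loss, deep


species = (("A", "B"), "C")
gene = (("A", "C"), "B")
dup, loss, deep = reconcile(gene, species)
print(dup, loss, deep)  # 1 3 1, so deep == loss - 2 * dup
```

On this example the gene tree needs one duplication and three losses, and one extra lineage persists on the edge above the (A, B) clade, which matches the identity in result 1 and the inequality in result 2.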
INDEX TERMS
Gene tree and species tree reconciliation, deep coalescence, gene duplication and loss, the parsimony principle, NP-hardness.
CITATION
Louxin Zhang, "From Gene Trees to Species Trees II: Species Tree Inference by Minimizing Deep Coalescence Events", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.8, no. 6, pp. 1685-1691, November/December 2011, doi:10.1109/TCBB.2011.83