Issue No. 04 - July-Aug. (2013 vol. 10)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.103
Mukul S. Bansal, Comput. Sci. & Artificial Intell. Lab., Massachusetts Inst. of Technol., Cambridge, MA, USA
Oliver Eulenstein, Dept. of Comput. Sci., Iowa State Univ., Ames, IA, USA
The use of genomic data sets for phylogenetics is complicated by the fact that evolutionary processes such as gene duplication and loss, or incomplete lineage sorting (deep coalescence), cause incongruence among gene trees. One well-known approach that deals with this complication is gene tree parsimony, which, given a collection of gene trees, seeks a species tree that requires the smallest number of evolutionary events to explain the incongruence of the gene trees. However, a lack of efficient algorithms has limited the use of this approach. Here, we present efficient algorithms for local search heuristics based on subtree pruning and regrafting (SPR) and tree bisection and reconnection (TBR) for gene tree parsimony under the 1) duplication, 2) loss, 3) duplication-loss, and 4) deep coalescence reconciliation costs. These novel algorithms improve upon the time complexities of previous algorithms for these problems by a factor of n, where n is the number of species in the collection of gene trees. Our algorithms provide a substantial improvement in runtime and scalability compared to previous implementations and enable large-scale gene tree parsimony analyses using any of the four reconciliation costs. Our algorithms have been implemented in the software packages DupTree and iGTP, and have already been used to perform several phylogenetic studies.
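To make the gene tree parsimony objective concrete, the following is a minimal sketch of the duplication cost only, computed with the standard least-common-ancestor (LCA) mapping between a gene tree and a candidate species tree. It is not the algorithm of the paper and is not taken from DupTree or iGTP: the nested-tuple tree encoding and the function names (`leaf_set`, `all_clusters`, `lca_cluster`, `duplication_cost`, `parsimony_score`) are assumptions made for illustration, and the LCA lookup is deliberately naive rather than efficient.

```python
# Illustrative sketch only: rooted binary trees are nested tuples whose
# leaves are species names (strings). All names here are hypothetical.

def leaf_set(tree):
    """Return the set of species labels below a tree node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))

def all_clusters(species_tree):
    """Collect the leaf set (cluster) of every node in the species tree."""
    clusters = [leaf_set(species_tree)]
    if not isinstance(species_tree, str):
        for child in species_tree:
            clusters.extend(all_clusters(child))
    return clusters

def lca_cluster(clusters, species):
    """Smallest species-tree cluster containing `species` (the LCA node)."""
    return min((c for c in clusters if species <= c), key=len)

def duplication_cost(gene_tree, species_tree):
    """Count gene tree nodes that map to the same species node as a child."""
    clusters = all_clusters(species_tree)

    def visit(node):
        # Returns (LCA mapping of node, duplications in its subtree).
        if isinstance(node, str):
            return lca_cluster(clusters, frozenset([node])), 0
        results = [visit(child) for child in node]
        covered = frozenset().union(*(m for m, _ in results))
        mapping = lca_cluster(clusters, covered)
        dups = sum(d for _, d in results)
        if any(m == mapping for m, _ in results):
            dups += 1  # node is a duplication under the LCA mapping
        return mapping, dups

    return visit(gene_tree)[1]

def parsimony_score(gene_trees, species_tree):
    """Total duplication cost of a candidate species tree."""
    return sum(duplication_cost(g, species_tree) for g in gene_trees)

species = (("A", "B"), "C")
genes = [(("A", "B"), "C"), (("A", "C"), "B"), (("A", "A"), "B")]
print(parsimony_score(genes, species))
```

A local search heuristic of the kind described in the abstract would evaluate a score like `parsimony_score` for each species tree in the SPR or TBR neighborhood of the current tree and move to the best neighbor. Recomputing the score from scratch for every neighbor, as this sketch would, is the cost that faster algorithms aim to avoid.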
Vegetation, Search problems, Phylogeny, Bioinformatics, Algorithm design and analysis, Complexity theory, Genomics
M. S. Bansal and O. Eulenstein, "Algorithms for Genome-Scale Phylogenetics Using Gene Tree Parsimony," in IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 10, no. 4, pp. 939-956, 2013.