Issue No.01 - Jan.-Feb. (2013 vol.10)
pp: 122-134
Priscila Biller, Inst. of Comput., Univ. of Campinas, Campinas, Brazil
Pedro Feijao, Inst. of Comput., Univ. of Campinas, Campinas, Brazil
Joao Meidanis, Inst. of Comput., Univ. of Campinas, Campinas, Brazil
ABSTRACT
Recently, the Single-Cut-or-Join (SCJ) operation was proposed as a basis for a new rearrangement distance between multichromosomal genomes, leading to very fast algorithms, both in theory and in practice. However, it was not clear how well this new distance fares when used to solve relevant problems, such as the reconstruction of evolutionary history. In this paper, we advance current knowledge by testing SCJ's ability regarding evolutionary reconstruction in two aspects: 1) How well does SCJ reconstruct evolutionary topologies? and 2) How well does SCJ reconstruct ancestral genomes? In the process of answering these questions, we implemented SCJ-based methods and made them available to the community. We ran experiments using as many as 200 genomes, with as many as 3,000 genes. For the first question, we found that SCJ typically recovers between 60 percent and more than 95 percent of the topology, as measured through the Robinson-Foulds distance (a.k.a. split distance) between trees. In other words, 60 percent to more than 95 percent of the original splits are also present in the reconstructed tree. For the second question, given a topology, SCJ's ability to reconstruct ancestral genomes depends on how far from the leaves the ancestor is. For nodes close to the leaves, about 85 percent of the gene adjacencies can be recovered. This percentage decreases as we move up the tree, but, even at the root, about 50 percent of the adjacencies are recovered, for as many as 64 leaves. Our findings corroborate the fact that SCJ leads to very conservative genome reconstructions, yielding very few false-positive gene adjacencies in the ancestral genomes, at the expense of a relatively larger number of false negatives. In addition, experiments with real data from the Campanulaceae and Protostomes groups show that SCJ reconstructs topologies of quality comparable to the accepted trees of the species involved.
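The split-based measure mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation (their code is in Java); the nested-tuple tree encoding and the function names `leaf_set`, `splits`, and `fraction_of_splits_recovered` are assumptions made only for this example. It counts the non-trivial splits (bipartitions of the leaf set) of a reference tree that also appear in a reconstructed tree.

```python
def leaf_set(tree):
    """Return the set of leaf names below a node (leaves are strings)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Return the non-trivial splits of a tree given as nested tuples.

    Each split is an unordered pair {side, complement} of leaf sets.
    Splits that separate a single leaf from the rest are skipped,
    because every tree on the same leaves contains them.
    """
    all_leaves = leaf_set(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        side = leaf_set(node)
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(frozenset((side, all_leaves - side)))
        for child in node:
            visit(child)

    visit(tree)
    return found


def fraction_of_splits_recovered(true_tree, reconstructed_tree):
    """Share of the reference tree's splits also present in the reconstruction."""
    true_splits = splits(true_tree)
    if not true_splits:
        return 1.0
    shared = true_splits & splits(reconstructed_tree)
    return len(shared) / len(true_splits)


# Hypothetical five-leaf example: the reconstruction misplaces leaf "B".
true_tree = ((("A", "B"), "C"), ("D", "E"))
reconstructed = ((("A", "C"), "B"), ("D", "E"))
print(fraction_of_splits_recovered(true_tree, reconstructed))
```

In this toy case the reference tree has two non-trivial splits (AB|CDE and ABC|DE) and only the second survives in the reconstruction, so half of the splits are recovered.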
As far as time is concerned, the methods we implemented can find a topology for 64 genomes with 2,000 genes each in about 10.7 minutes, and reconstruct the ancestral genomes in a 64-leaf tree in about 3 seconds, both on a typical desktop computer. It should be noted that our code is written in Java and we made no significant effort to optimize it.
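The adjacency-based quantities in the abstract can be sketched in the same spirit. The sketch below assumes genomes are given as lists of chromosomes, each a list of signed integers, and treats a genome as its set of adjacencies between gene extremities. Under that representation the SCJ distance is the size of the symmetric difference of the two adjacency sets, as defined in the earlier SCJ paper by Feijão and Meidanis. The helper names (`adjacencies`, `scj_distance`, `adjacency_recovery`) and the input encoding are hypothetical and are not taken from the authors' software.

```python
def adjacencies(chromosomes, circular=False):
    """Return the set of adjacencies of a genome.

    chromosomes: list of chromosomes, each a list of signed gene ids.
    An extremity is (gene, 'h') for the head or (gene, 't') for the tail.
    A gene read as +g runs tail -> head; read as -g it runs head -> tail.
    """
    result = set()
    for chrom in chromosomes:
        pairs = list(zip(chrom, chrom[1:]))
        if circular and chrom:
            pairs.append((chrom[-1], chrom[0]))  # close the circle
        for a, b in pairs:
            right_of_a = (abs(a), "h" if a > 0 else "t")
            left_of_b = (abs(b), "t" if b > 0 else "h")
            result.add(frozenset((right_of_a, left_of_b)))
    return result


def scj_distance(genome_a, genome_b):
    """Cuts plus joins needed: adjacencies present in exactly one genome."""
    return len(adjacencies(genome_a) ^ adjacencies(genome_b))


def adjacency_recovery(true_genome, reconstructed_genome):
    """Count true-positive, false-positive and false-negative adjacencies."""
    truth = adjacencies(true_genome)
    guess = adjacencies(reconstructed_genome)
    return {
        "true_positives": len(truth & guess),
        "false_positives": len(guess - truth),
        "false_negatives": len(truth - guess),
    }


# Hypothetical toy genomes: inverting gene 2 costs two cuts and two joins.
print(scj_distance([[1, 2, 3]], [[1, -2, 3]]))

# A conservative reconstruction: no wrong adjacency, one missing adjacency.
print(adjacency_recovery([[1, 2, 3, 4]], [[1, 2], [3, 4]]))
```

The second call mirrors the conservative behavior described in the abstract: the reconstructed genome contains no false-positive adjacency but misses one true adjacency, which shows up as a false negative.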
INDEX TERMS
Genomics, Vegetation, Topology, Phylogeny, Extremities, Biological cells, Polynomials, Genome rearrangement
CITATION
Priscila Biller, Pedro Feijao, Joao Meidanis, "Rearrangement-Based Phylogeny Using the Single-Cut-or-Join Operation", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.10, no. 1, pp. 122-134, Jan.-Feb. 2013, doi:10.1109/TCBB.2012.168