IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2011, vol. 8, Issue No. 05 - September/October

pp: 1318-1329

Pedro Feijão, University of Campinas, Campinas

João Meidanis, Scylla Bioinformatics and University of Campinas, Campinas

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2011.34

ABSTRACT

The breakpoint distance is one of the most straightforward genome comparison measures. Surprisingly, when it comes to defining it precisely for multichromosomal genomes with both linear and circular chromosomes, there is more than one way to go about it. Pevzner and Tesler gave a definition in a 2003 paper, Tannier et al. defined it differently in 2008, and in this paper we provide yet another alternative, calling it SCJ for single-cut-or-join, in analogy to the popular double cut and join (DCJ) measure. We show that several genome rearrangement problems, such as median and halving, become easy for SCJ, and provide linear and higher polynomial time algorithms for them. For the multichromosomal linear genome median problem, this is the first polynomial time algorithm described, since for other distances this problem is NP-hard. In addition, we show that small parsimony under SCJ is also easy, and can be solved by a variant of Fitch's algorithm. In contrast, big parsimony is NP-hard under SCJ. This new distance measure may be of value as a speedily computable, first approximation to distances based on more realistic rearrangement models.
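The abstract does not spell out how the SCJ distance is computed. The sketch below illustrates the idea under assumptions taken from our reading of the full paper, not from the text above: a genome is modeled as a set of adjacencies between gene extremities, the SCJ distance is the number of adjacencies present in one genome but not the other, and a median of three general (linear or circular) multichromosomal genomes keeps every adjacency found in at least two of them. All names (`extremities`, `adjacencies`, `scj_distance`, `scj_median_of_three`) and the `"1t"`/`"1h"` labeling are hypothetical, and the linear-only median discussed in the abstract needs additional steps that are not shown here.

```python
from collections import Counter
from typing import FrozenSet, Iterable, List, Set, Tuple

Adjacency = FrozenSet[str]
Genome = Set[Adjacency]


def extremities(gene: int) -> Tuple[str, str]:
    """Return (entry, exit) extremities of a signed gene.

    A positive gene g is read tail -> head, a negative one head -> tail.
    The "<g>t" / "<g>h" labels are an illustrative convention.
    """
    name = str(abs(gene))
    tail, head = name + "t", name + "h"
    return (tail, head) if gene > 0 else (head, tail)


def adjacencies(chromosomes: Iterable[Tuple[List[int], bool]]) -> Genome:
    """Build the adjacency set of a genome.

    Each chromosome is a pair (signed gene list, is_circular).
    """
    genome: Genome = set()
    for genes, circular in chromosomes:
        # Consecutive genes share one adjacency: exit of left, entry of right.
        for left, right in zip(genes, genes[1:]):
            genome.add(frozenset((extremities(left)[1], extremities(right)[0])))
        # A circular chromosome also joins its last gene back to its first.
        if circular and genes:
            genome.add(
                frozenset((extremities(genes[-1])[1], extremities(genes[0])[0]))
            )
    return genome


def scj_distance(a: Genome, b: Genome) -> int:
    """Cuts needed to remove a - b plus joins needed to create b - a."""
    return len(a - b) + len(b - a)


def scj_median_of_three(a: Genome, b: Genome, c: Genome) -> Genome:
    """Assumed median rule: keep adjacencies present in at least two genomes."""
    counts = Counter(adj for genome in (a, b, c) for adj in genome)
    return {adj for adj, n in counts.items() if n >= 2}


# One linear chromosome (1 2 3) versus the same chromosome with gene 2 reversed.
genome_a = adjacencies([([1, 2, 3], False)])
genome_b = adjacencies([([1, -2, 3], False)])
print(scj_distance(genome_a, genome_b))  # 4: two cuts followed by two joins
```

Because both functions only use set operations and counting, their running time is linear in the number of adjacencies, which matches the abstract's description of SCJ as a quickly computable first approximation.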

INDEX TERMS

Biology and genetics, combinatorial algorithms, computations on discrete structures.

CITATION

Pedro Feijão, João Meidanis, "SCJ: A Breakpoint-Like Distance that Simplifies Several Rearrangement Problems",

*IEEE/ACM Transactions on Computational Biology and Bioinformatics*, vol. 8, no. 5, pp. 1318-1329, September/October 2011, doi:10.1109/TCBB.2011.34