Issue No.02 - March-April (2013 vol.10)
pp: 352-360
Serdar Bozdag, Dept. of Math., Stat. & Comput. Sci., Marquette Univ., Milwaukee, WI, USA
Timothy J. Close, Dept. of Botany & Plant Sci., Univ. of California, Riverside, Riverside, CA, USA
Stefano Lonardi, Dept. of Comput. Sci. & Eng., Univ. of California, Riverside, Riverside, CA, USA
ABSTRACT
The problem of computing the minimum tiling path (MTP) from a set of clones arranged in a physical map is a cornerstone of hierarchical (clone-by-clone) genome sequencing projects. We formulate this problem in a graph-theoretical framework and then solve it with a combination of minimum hitting set and minimum spanning tree algorithms. The tool implementing this strategy, called FMTP, shows improved performance compared to the widely used software FPC. When FMTP and FPC are run on the same physical map, the MTP produced by FMTP covers a larger portion of the genome and uses fewer clones. For instance, on the rice genome the MTP produced by our tool would reduce the cost of a clone-by-clone sequencing project by about 11 percent. Source code, benchmark data sets, and documentation of FMTP are freely available at http://code.google.com/p/fingerprint-basedminimal-tiling-path/ under the MIT license.
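To make the hitting-set ingredient concrete, here is a minimal Python sketch. It assumes a simple greedy heuristic, which is not necessarily the algorithm FMTP uses, and the function name `greedy_hitting_set` and the clone names are hypothetical. Each input set lists the candidate clones that could cover one region of the map; the goal is a small set of clones that intersects every such set.

```python
from collections import Counter


def greedy_hitting_set(subsets):
    """Return a small set of elements that intersects every non-empty subset.

    Greedy heuristic: repeatedly pick the element that occurs in the most
    subsets not yet hit. The result is not guaranteed to be minimum.
    """
    # Empty subsets cannot be hit by any element, so they are skipped.
    remaining = [set(s) for s in subsets if s]
    chosen = set()
    while remaining:
        counts = Counter(e for s in remaining for e in s)
        # Most frequent element first; ties broken by name for determinism.
        best = min(counts, key=lambda e: (-counts[e], e))
        chosen.add(best)
        remaining = [s for s in remaining if best not in s]
    return chosen


# Toy example (hypothetical data): each set holds the clones covering one region.
regions = [{"c1", "c2"}, {"c2", "c3"}, {"c3", "c4"}, {"c4", "c5"}]
tiling = greedy_hitting_set(regions)
print(sorted(tiling))
```

Minimum hitting set is NP-hard in general, so a heuristic like this trades optimality for speed; an exact solver would be needed to guarantee the smallest clone set.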
INDEX TERMS
Physical mapping, minimum tiling path, FMTP, FPC, minimum hitting set
CITATION
Serdar Bozdag, Timothy J. Close, Stefano Lonardi, "A Graph-Theoretical Approach to the Selection of the Minimum Tiling Path from a Physical Map", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.10, no. 2, pp. 352-360, March-April 2013, doi:10.1109/TCBB.2013.26