Issue No.04 - July/August (2011 vol.8)
pp: 1134-1140
Yang Chen , Waseda University, Fukuoka
Jinglu Hu , Waseda University, Fukuoka
ABSTRACT
Sequencing by hybridization is a promising, cost-effective technology for high-throughput DNA sequencing on microarray chips. However, spectrum errors rooted in experimental conditions make accurate and fast reconstruction of the original sequence a challenging problem. Over the last decade, a variety of analyses and designs have been proposed to overcome this problem, each striking a different trade-off between speed and accuracy. Motivated by the idea that errors can be identified by analyzing the interrelation of spectrum elements, this paper presents a constructive heuristic algorithm whose reconstruction is guided by a set of well-defined criteria and rules. Instead of directly reconstructing the original sequence, the new algorithm first builds several accurate short fragments, which are then carefully assembled into a whole sequence. Experiments on benchmark instance sets demonstrate that the proposed method reconstructs long DNA sequences with higher accuracy than current approaches in the literature.
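To make the terms in the abstract concrete, the following is a minimal Python sketch of a spectrum and of fragment building. It is not the authors' algorithm: the paper's criteria and rules are not given in this abstract. The sketch assumes an error-free spectrum is the set of all length-l substrings of the target, and it merges two probes only when their (l-1)-overlap is unambiguous in both directions. The function names `ideal_spectrum` and `build_fragments` are hypothetical, chosen for illustration.

```python
def ideal_spectrum(sequence, l):
    """Return the set of all length-l substrings (probes) of `sequence`."""
    return {sequence[i:i + l] for i in range(len(sequence) - l + 1)}


def build_fragments(spectrum):
    """Chain probes into fragments wherever the (l-1)-overlap is unambiguous.

    Illustrative rule (an assumption, not the paper's criteria): probe p is
    extended by probe q only if q is p's sole successor and p is q's sole
    predecessor.
    """
    probes = sorted(spectrum)
    # q follows p when p's last l-1 letters equal q's first l-1 letters.
    succ = {p: [q for q in probes if q != p and p[1:] == q[:-1]] for p in probes}
    pred = {p: [q for q in probes if q != p and q[1:] == p[:-1]] for p in probes}

    def unique_next(p):
        if len(succ[p]) == 1 and len(pred[succ[p][0]]) == 1:
            return succ[p][0]
        return None

    def unique_prev(p):
        if len(pred[p]) == 1 and len(succ[pred[p][0]]) == 1:
            return pred[p][0]
        return None

    # Visit true fragment starts first; leftovers (e.g. cycles) come after.
    order = [p for p in probes if unique_prev(p) is None] + probes
    used, fragments = set(), []
    for start in order:
        if start in used:
            continue
        used.add(start)
        fragment, nxt = start, unique_next(start)
        while nxt is not None and nxt not in used:
            fragment += nxt[-1]  # each step contributes one new letter
            used.add(nxt)
            nxt = unique_next(nxt)
        fragments.append(fragment)
    return fragments


spectrum = ideal_spectrum("ACGTTGCA", 3)
print(build_fragments(spectrum))            # error-free spectrum
print(build_fragments(spectrum - {"GTT"}))  # one probe missing (negative error)
```

With the error-free spectrum, the sketch recovers the whole sequence as a single fragment. Dropping the probe GTT simulates a negative error and splits the result into two shorter fragments, which a later assembly step would have to join.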
INDEX TERMS
DNA sequencing, sequencing by hybridization, heuristic algorithm, bioinformatics, microarrays.
CITATION
Yang Chen, Jinglu Hu, "Accurate Reconstruction for DNA Sequencing by Hybridization Based on a Constructive Heuristic", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.8, no. 4, pp. 1134-1140, July/August 2011, doi:10.1109/TCBB.2010.89