Sequencing-by-Hybridization Revisited: The Analog-Spectrum Proposal
January-March 2004 (vol. 1 no. 1)
pp. 46-52
All published approaches to DNA sequencing by hybridization (SBH) consist of the biochemical acquisition of the spectrum of a target sequence (the set of its subsequences conforming to a given probing pattern), followed by the algorithmic reconstruction of the sequence from its spectrum. In the "standard" or "uniform" approach, the probing pattern is a string of length L, and the length of reliably reconstructible sequences is known to be m_{len}=O(2^L). For a fixed microarray area, higher sequencing performance can be achieved by inserting nonprobing gaps ("wild-cards") in the probing pattern. The reconstruction, however, must cope with fooling probes introduced by the gaps, and the algorithm fails when the spectrum becomes too densely populated; even so, a length of m_{comp}=O(4^L) can be achieved. Despite the combinatorial success of gapped probing, all current approaches are based on a biochemically unrealistic spectrum-acquisition model (the digital spectrum). The reality of hybridization is much more complex. Departing from the conventional model, this paper proposes an alternative, called the analog-spectrum model, which more closely reflects the biochemical process. This novel model reestablishes probe length as the performance-governing factor and adopts "semidegenerate bases" as suitable emulators of the currently inadequate universal bases. One important conclusion is that accurate biochemical measurements are pivotal to the success of SBH. The theoretical proposal presented in this paper should be a convincing stimulus for the needed biotechnological work.
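
As an illustration of the conventional digital spectrum described above, the following minimal Python sketch collects the probes of a target sequence that conform to a probing pattern. The function name `spectrum` and the pattern encoding ('1' for a probing position, '*' for a wild-card gap) are assumptions made for this example and are not notation from the paper. The sketch models only the idealized, noise-free digital spectrum, not the analog-spectrum model the paper proposes.

```python
def spectrum(target: str, pattern: str) -> set[str]:
    """Return the digital spectrum of `target` under `pattern`.

    `pattern` is a string over {'1', '*'} (hypothetical encoding):
    '1' marks a probing position, '*' marks a nonprobing gap (wild-card).
    Each probe is one window of the target with gap positions masked as '*'.
    """
    span = len(pattern)
    probes = set()
    # Slide the pattern over every position where it fits entirely.
    for start in range(len(target) - span + 1):
        window = target[start:start + span]
        probe = "".join(
            base if mark == "1" else "*"
            for base, mark in zip(window, pattern)
        )
        probes.add(probe)
    return probes


if __name__ == "__main__":
    target = "ACGTAC"
    # Uniform probing: a solid string of L = 3 probing positions.
    print(sorted(spectrum(target, "111")))
    # Gapped probing: two probing positions separated by one wild-card.
    print(sorted(spectrum(target, "1*1")))
```

Because the spectrum is a set, repeated probes collapse into one entry; this is one reason reconstruction becomes ambiguous as the target grows longer relative to the probe length.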

Index Terms:
DNA sequencing, sequencing-by-hybridization, microarrays, gapped probes, thermodynamics of hybridization, analog spectrum, semidegenerate bases.
Citation:
Franco P. Preparata, "Sequencing-by-Hybridization Revisited: The Analog-Spectrum Proposal," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 1, no. 1, pp. 46-52, Jan.-March 2004, doi:10.1109/TCBB.2004.12