2010 18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines (2010)
Charlotte, North Carolina, USA
May 2, 2010 to May 4, 2010
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/FCCM.2010.21
Understanding the structure and function of DNA sequences is an important area of research in modern biology. Unfortunately, analysis of such data is often complicated by mutations introduced by evolutionary processes. At the lowest scale, these usually appear in biological sequences as character substitutions, insertions or deletions (indels). They introduce an element of uncertainty that increases the time complexity of sequence analysis algorithms and complicates their practical use. One class of such algorithms searches for tandem repeats with possible errors, known as approximate tandem repeats. This paper investigates the possibilities for hardware acceleration of approximate tandem repeat searching and describes a parametrized architecture suitable for FPGA chips. The proposed architecture detects tandems with both types of errors (mismatches and indels) and does not limit the length of the detected tandem. A prototype of the circuit was implemented in VHDL and synthesized for Virtex-5 technology. Application to test sequences shows that the circuit speeds up tandem searching by a factor on the order of thousands compared with the best-known software method, which relies on suffix arrays.
Approximate tandem repeat, dynamic programming, systolic array, FPGA, DNA
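To make the problem statement in the abstract concrete, the following is a minimal software sketch of what "approximate tandem repeat" means: two adjacent substrings whose edit distance (counting mismatches and indels) stays within a bound. It is a brute-force illustration only and is not the paper's systolic-array architecture or the suffix-array method it is compared against. The function names and the parameters `min_unit`, `max_unit` and `max_errors` are assumptions introduced for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def find_approx_tandems(seq, min_unit=2, max_unit=8, max_errors=1):
    """Brute-force search for adjacent substring pairs within max_errors edits.

    Returns tuples (start, left_len, right_len, errors). The unit length is
    capped by max_unit here purely to keep this illustration small.
    """
    hits = []
    n = len(seq)
    for start in range(n):
        for left_len in range(min_unit, max_unit + 1):
            mid = start + left_len
            if mid > n:
                break
            left = seq[start:mid]
            # An indel changes the length of the second copy, so try nearby lengths.
            shortest = max(1, left_len - max_errors)
            for right_len in range(shortest, left_len + max_errors + 1):
                end = mid + right_len
                if end > n:
                    break
                errors = edit_distance(left, seq[mid:end])
                if errors <= max_errors:
                    hits.append((start, left_len, right_len, errors))
    return hits


if __name__ == "__main__":
    # Second copy of ACGT carries one substitution (G replaced by C).
    print(find_approx_tandems("ACGTACCT", min_unit=4, max_unit=4, max_errors=1))
```

Every candidate pair here costs a full dynamic-programming table, which is what makes the naive approach slow on long sequences and motivates acceleration.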
M. Lexa and T. Martínek, "Hardware Acceleration of Approximate Tandem Repeat Detection," 2010 18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), Charlotte, North Carolina, USA, 2010, pp. 79-86.