Intelligent Systems Design and Applications, International Conference on (2009)
Nov. 30, 2009 to Dec. 2, 2009
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ISDA.2009.41
The detection of an over-represented subsequence in a set of (carefully chosen) DNA sequences is often the main clue leading to the investigation of a possible functional role for that subsequence. Over-represented substrings (possibly with local mutations) in a biological string are termed motifs. A typical functional unit that can be modeled by a motif is a Transcription Factor Binding Site (TFBS), a portion of the DNA sequence suited to binding a protein that participates in complex transcriptomic biochemical reactions. A simplified combinatorial problem, called the planted (l-d)-motif problem (also known as the (l-d) Challenge Problem), has been proposed in the literature to capture the essential combinatorial nature of the motif finding problem. In this paper we propose a novel graph-based algorithm for solving a refinement of the (l-d) Challenge Problem. Experimental results show that instances of the (l-d) Challenge Problem considered difficult for competing state-of-the-art methods in the literature can be solved efficiently in our framework.
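To make the problem statement concrete, the following is a minimal sketch of the planted (l-d)-motif problem in its usual formulation: find every string of length l that occurs in each input sequence with at most d mismatches. It is a naive exhaustive search written only to illustrate the definition; it is not the graph-based algorithm proposed in the paper, and the function names and toy sequences are hypothetical.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(candidate: str, sequence: str, d: int) -> bool:
    """True if some window of the sequence is within d mismatches of candidate."""
    l = len(candidate)
    return any(
        hamming(candidate, sequence[i:i + l]) <= d
        for i in range(len(sequence) - l + 1)
    )


def planted_motifs(sequences, l, d, alphabet="ACGT"):
    """Naive search: test all |alphabet|**l candidates against every sequence.

    Illustrative only; the cost grows exponentially with l.
    """
    candidates = ("".join(chars) for chars in product(alphabet, repeat=l))
    return [
        c for c in candidates
        if all(occurs_within(c, s, d) for s in sequences)
    ]


# Toy input (hypothetical): "ACGT" appears exactly in the first sequence
# and with one substitution in each of the other two.
example = ["TTACGTTT", "GGACCTGG", "CCAGGTCC"]
print("ACGT" in planted_motifs(example, l=4, d=1))  # prints True
```

Because the search enumerates every possible string of length l, it is practical only for very small l, which is one reason the problem calls for more efficient methods.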
motif finding problem, TFBS detection, (l-d) Challenge Problem, graph-based algorithm
M. Elena Renda, Filippo Geraci, Marco Pellegrini, "An Efficient Combinatorial Approach for Solving the DNA Motif Finding Problem", Intelligent Systems Design and Applications, International Conference on, vol. 00, pp. 335-340, 2009, doi:10.1109/ISDA.2009.41