2004 IEEE Computational Systems Bioinformatics Conference (CSB'04)
PRUNER: Algorithms for Finding Monad Patterns in DNA Sequences
Stanford, California
August 16-August 19
ISBN: 0-7695-2194-0
We present new algorithms for discovering monad patterns in DNA sequences. Monad patterns are of the form (l, d)-k, where l is the length of the pattern, d is the maximum number of mismatches allowed, and k is the minimum number of times the pattern is repeated in the given sample. The time-complexity of some of the best known algorithms to date is O(nt^2 l^d \left| \sum \right|^d ), where t is the number of input sequences, and n is the length of each input sequence. The first algorithm that we present takes O(n^2 t^2 l^{{d \mathord{\left/ {\vphantom {d 2}} \right. \kern-\nulldelimiterspace} 2}}\left| \Sigma \right|^{{d \mathord{\left/ {\vphantom {d 2}} \right. \kern-\nulldelimiterspace} 2}} ) and space 0(ntl^{{d \mathord{\left/ {\vphantom {d 2}} \right. \kern-\nulldelimiterspace} 2}} \left| \Sigma \right|^{{d \mathord{\left/ {\vphantom {d 2}} \right. \kern-\nulldelimiterspace} 2}} ), and the second algorithm takes 0(n^3 t^3 l^{{d \mathord{\left/ {\vphantom {d 2}} \right. \kern-\nulldelimiterspace} 2}} \left| \Sigma \right|^{{d \mathord{\left/ {\vphantom {d 2}} \right. \kern-\nulldelimiterspace} 2}} ) time using 0(l^{{d \mathord{\left/ {\vphantom {d 2}} \right. \kern-\nulldelimiterspace} 2}} \left| \Sigma \right|^{{d \mathord{\left/ {\vphantom {d 2}} \right. \kern-\nulldelimiterspace} 2}} ) space. In practice, our algorithms have much better performance provided the d/l ratio is small.
Index Terms:
Pattern discovery, regulatory patterns, k-mismatch patterns
Citation:
Ravi VijayaSatya, Amar Mukherjee, "PRUNER: Algorithms for Finding Monad Patterns in DNA Sequences," csb, pp.662-665, 2004 IEEE Computational Systems Bioinformatics Conference (CSB'04), 2004
Usage of this product signifies your acceptance of the
Terms of Use.
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||