Issue No. 3, July-September 2007 (vol. 4)
pp. 496-505
ABSTRACT
In homology search, good spaced seeds achieve higher sensitivity than consecutive seeds at the same cost (weight). However, two questions remain open: what mechanism gives spaced seeds their power, and how optimal spaced seeds can be characterized. This paper investigates both questions by formally analyzing the average number of non-overlapping hits and the hit probability of a spaced seed in the Bernoulli sequence model. We prove that when the length of a non-uniformly spaced seed is bounded above by an exponential function of the seed weight, the seed strictly outperforms the traditional consecutive seed of the same weight in both (i) the average number of non-overlapping hits and (ii) the asymptotic hit probability. This answers the first question in the Bernoulli sequence model. The theoretical study in this paper also gives a new solution to finding long optimal seeds.
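To make the notion of hit probability concrete, the following is a minimal sketch and not the paper's method; the paper's analysis is theoretical. It assumes a seed is written as a string of `1` (match required) and `0` (don't care), and that the aligned region is a Bernoulli sequence in which each position is a match with probability `p`. The function name `hit_probability` and the example seeds are illustrative assumptions. The sketch tracks the probability mass of all prefixes that have not yet been hit, keyed by their last L-1 positions, so it is only practical for short seeds, since the state space grows as 2^(L-1).

```python
from itertools import product


def hit_probability(seed: str, n: int, p: float) -> float:
    """Exact probability that `seed` hits a Bernoulli(p) region of length n.

    `seed` is a string of '1' (must match) and '0' (don't care).
    A hit occurs at offset i if every '1' position of the seed,
    shifted by i, lands on a match in the region.
    """
    length = len(seed)
    care = [i for i, c in enumerate(seed) if c == "1"]
    if n < length:
        return 0.0  # the seed does not fit in the region

    # Probability of each possible (length-1)-position prefix; no hit is possible yet.
    states = {}
    for bits in product((0, 1), repeat=length - 1):
        prob = 1.0
        for b in bits:
            prob *= p if b else 1.0 - p
        states[bits] = prob

    # Extend one position at a time, discarding mass that produces a hit.
    for _ in range(n - length + 1):
        nxt = {}
        for bits, prob in states.items():
            for b in (0, 1):
                window = bits + (b,)
                if all(window[i] for i in care):
                    continue  # seed hits here: remove from the "no hit" mass
                key = window[1:]
                nxt[key] = nxt.get(key, 0.0) + prob * (p if b else 1.0 - p)
        states = nxt

    return 1.0 - sum(states.values())


if __name__ == "__main__":
    # Compare a consecutive seed and a spaced seed of the same weight (3)
    # over a range of region lengths; both seeds are hypothetical examples.
    for n in (16, 32, 64):
        print(n, hit_probability("111", n, 0.7), hit_probability("1101", n, 0.7))
```

Running the comparison for increasing `n` gives a way to observe how the two seeds behave as the region grows. For short regions a consecutive seed can come out ahead simply because it fits at more offsets, which is one reason the paper's result is stated for the asymptotic hit probability.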
INDEX TERMS
Homology search, pattern matching, sequence alignment, spaced seeds, renewal theory, run statistics
CITATION
Louxin Zhang, "Superiority of Spaced Seeds for Homology Search", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 4, no. 3, pp. 496-505, July-September 2007, doi:10.1109/tcbb.2007.1013