Issue No. 06 - Nov.-Dec. (2012 vol. 9)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.84
C. Boucher, Dept. of Comput. Sci., Colorado State Univ., Fort Collins, CO, USA
M. Omar, Dept. of Math., California Inst. of Technol., Pasadena, CA, USA
Given a set S of n strings, each of length ℓ, and a nonnegative value d, we define a center string as a string of length ℓ that has Hamming distance at most d from each string in S. The #CLOSEST STRING problem aims to determine the number of center strings for a given set of strings S and input parameters n, ℓ, and d. We show #CLOSEST STRING is impossible to solve exactly or even approximately in polynomial time, and that restricting #CLOSEST STRING so that any one of the parameters n, ℓ, or d is fixed leads to a fully polynomial-time randomized approximation scheme (FPRAS). We show equivalent results for the problem of efficiently sampling center strings uniformly at random (u.a.r.).
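To make the definition of a center string concrete, the following is a minimal brute-force sketch of the counting problem, not the method of the paper. It enumerates every string of length ℓ over a given alphabet and counts those within Hamming distance d of every string in S. The function names `hamming` and `count_center_strings`, and the explicit `alphabet` argument, are assumptions introduced for illustration; the running time is exponential in ℓ, so this is only usable on tiny inputs.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def count_center_strings(strings, d, alphabet):
    """Count strings of length l within Hamming distance d of every input string.

    Illustrative brute force only: it tries all len(alphabet) ** l candidates.
    """
    if not strings:
        raise ValueError("need at least one input string")
    length = len(strings[0])
    if any(len(s) != length for s in strings):
        raise ValueError("all input strings must have the same length")

    count = 0
    for candidate in product(alphabet, repeat=length):
        # A candidate is a center string if it is close to every string in S.
        if all(hamming(candidate, s) <= d for s in strings):
            count += 1
    return count


if __name__ == "__main__":
    # S = {"00", "11"} over the binary alphabet with d = 1:
    # only "01" and "10" are within distance 1 of both strings.
    print(count_center_strings(["00", "11"], 1, "01"))
```

For S = {"00", "11"} and d = 1 the count is 2, since "00" and "11" are each at distance 2 from the other input string.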
Hamming distance, Approximation methods, Polynomials, Approximation algorithms, Bioinformatics, Computational biology, Sequential analysis, Computational complexity, Biological sequence analysis, motif recognition, fully polynomial-time randomized approximation scheme (FPRAS), fully polynomial almost uniform sampler (FPAUS)
C. Boucher, M. Omar, "On the Hardness of Counting and Sampling Center Strings", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 6, pp. 1843-1846, Nov.-Dec. 2012, doi:10.1109/TCBB.2012.84