Issue No.06 - Nov.-Dec. (2012 vol.9)
pp: 1843-1846
C. Boucher, Dept. of Comput. Sci., Colorado State Univ., Fort Collins, CO, USA
M. Omar, Dept. of Math., California Inst. of Technol., Pasadena, CA, USA
ABSTRACT
Given a set S of n strings, each of length ℓ, and a nonnegative value d, we define a center string as a string of length ℓ that has Hamming distance at most d from each string in S. The #CLOSEST STRING problem aims to determine the number of center strings for a given set of strings S and input parameters n, ℓ, and d. We show #CLOSEST STRING is impossible to solve exactly or even approximately in polynomial time, and that restricting #CLOSEST STRING so that any one of the parameters n, ℓ, or d is fixed leads to a fully polynomial-time randomized approximation scheme (FPRAS). We show equivalent results for the problem of efficiently sampling center strings uniformly at random (u.a.r.).
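To make the definition concrete, the following is a minimal brute-force sketch of the counting problem, not the authors' method. It enumerates every string of length ℓ over a given alphabet and counts those within Hamming distance d of every string in S, so it takes time exponential in ℓ and is only usable on toy inputs. The function names `hamming` and `count_center_strings` and the explicit `alphabet` parameter are assumptions introduced for illustration.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def count_center_strings(strings, d, alphabet):
    """Count center strings of `strings` by exhaustive enumeration.

    A center string has the same length as the inputs and Hamming
    distance at most d from each of them. Hypothetical helper for
    illustration only: it checks all len(alphabet) ** length candidates.
    """
    if not strings:
        raise ValueError("need at least one input string")
    length = len(strings[0])
    if any(len(s) != length for s in strings):
        raise ValueError("all input strings must have the same length")

    count = 0
    for candidate in product(alphabet, repeat=length):
        # Keep the candidate only if it is close enough to every input.
        if all(hamming(candidate, s) <= d for s in strings):
            count += 1
    return count


if __name__ == "__main__":
    # S = {00, 11}, d = 1 over the binary alphabet: centers are 01 and 10.
    print(count_center_strings(["00", "11"], 1, "01"))
```

For S = {00, 11} and d = 1 over the binary alphabet, the center strings are 01 and 10, so the count is 2. The hardness results in the abstract concern inputs where such exhaustive enumeration is infeasible.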
INDEX TERMS
Hamming distance, Approximation methods, Polynomials, Approximation algorithms, Bioinformatics, Computational biology, Sequential analysis, computational complexity, Biological sequence analysis, motif recognition, fully polynomial-time randomized approximation scheme (FPRAS), journal, fully polynomial almost uniform sampler (FPAUS)
CITATION
C. Boucher, M. Omar, "On the Hardness of Counting and Sampling Center Strings", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 6, pp. 1843-1846, Nov.-Dec. 2012, doi:10.1109/TCBB.2012.84