Issue No.06 - Nov.-Dec. (2012 vol.9)
pp: 1629-1638
C. Ma , Dept. of Comput. Sci., Univ. of Hong Kong, Hong Kong, China
T. K. F. Wong , Dept. of Comput. Sci., Univ. of Hong Kong, Hong Kong, China
T. W. Lam , Dept. of Comput. Sci., Univ. of Hong Kong, Hong Kong, China
W. K. Hon , Dept. of Comput. Sci., Nat. Tsing Hua Univ., Hsinchu, Taiwan
K. Sadakane , Nat. Inst. of Inf., Tokyo, Japan
S. M. Yiu , Dept. of Comput. Sci., Univ. of Hong Kong, Hong Kong, China
ABSTRACT
Structural alignment has been shown to be an effective computational method for identifying structural noncoding RNA (ncRNA) candidates, as ncRNAs are known to be conserved in secondary structure. However, the complexity of structural alignment algorithms increases when the structure has pseudoknots. Even for the simplest type of pseudoknots (simple pseudoknots), the fastest algorithm runs in O(mn^3) time, where m is the length of the query ncRNA (with known structure) and n is the length of the target sequence (with unknown structure). In practice, we are usually given a long DNA sequence and try to locate regions in it that are possible candidates for a particular ncRNA. Thus, we need to run the structural alignment algorithm on every possible region of the long sequence. For example, finding candidates for a known ncRNA of length 100 in a sequence of length 50,000 takes more than one day. In this paper, we provide an efficient algorithm that solves the problem for simple pseudoknots and is shown to be 10 times faster. The speedup stems from an effective pruning strategy: the algorithm computes a lower bound on the score of the optimal alignment and estimates the maximum score that a candidate can achieve, and it uses these two values to decide whether to prune the current candidate.
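The pruning idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not give the actual bounds, so the sketch substitutes a toy position-wise match count for the expensive structural alignment and a character-count overlap for the cheap estimate of a candidate's maximum score. The names `best_window`, `full_score`, and `upper_bound` are hypothetical. The best score found so far serves as the lower bound on the optimum, and a region is skipped whenever its estimated maximum cannot exceed that bound.

```python
from collections import Counter
from typing import Optional, Tuple


def full_score(region: str, query: str) -> int:
    """Toy stand-in for the expensive alignment: count matching positions."""
    return sum(1 for a, b in zip(region, query) if a == b)


def upper_bound(region: str, query: str) -> int:
    """Cheap estimate of the maximum score a region can achieve.

    The number of matching positions can never exceed the number of
    characters the two strings share (multiset intersection), so this
    value never underestimates full_score.
    """
    return sum((Counter(region) & Counter(query)).values())


def best_window(target: str, query: str) -> Tuple[Optional[int], int, int]:
    """Scan every window of len(query) in target, pruning hopeless ones.

    Returns (best_start, best_score, full_runs), where full_runs counts
    how many times the expensive scorer was actually invoked.
    """
    window = len(query)
    best_start: Optional[int] = None
    best_score = 0  # lower bound on the optimal score seen so far
    full_runs = 0
    for start in range(len(target) - window + 1):
        region = target[start:start + window]
        # Prune: this candidate cannot beat the current lower bound.
        if upper_bound(region, query) <= best_score:
            continue
        full_runs += 1
        score = full_score(region, query)
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_score, full_runs
```

Because the estimate never falls below the true score, pruning in this sketch cannot discard the best window; it only avoids running the expensive scorer on regions that cannot improve the result. How much work is saved depends on how tight the estimate is relative to its cost.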
INDEX TERMS
Bioinformatics, Genomics, RNA, Computational biology, Complexity theory, Algorithm design and analysis, Heuristic algorithms, Structural alignment, Noncoding RNAs, Pseudoknot
CITATION
C. Ma, T. K. F. Wong, T. W. Lam, W. K. Hon, K. Sadakane, S. M. Yiu, "An Efficient Alignment Algorithm for Searching Simple Pseudoknots over Long Genomic Sequence", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 6, pp. 1629-1638, Nov.-Dec. 2012, doi:10.1109/TCBB.2012.104