The Community for Technology Leaders
RSS Icon
Issue No.01 - January/February (2012 vol.9)
pp: 161-168
T. K. F. Wong , Dept. of Comput. Sci., Univ. of Hong Kong, Hong Kong, China
Y. S. Chiu , Dept. of Comput. Sci., Univ. of Hong Kong, Hong Kong, China
T. W. Lam , Dept. of Comput. Sci., Univ. of Hong Kong, Hong Kong, China
S. M. Yiu , Dept. of Comput. Sci., Univ. of Hong Kong, Hong Kong, China
In this paper, we consider the problem of structural alignment of a target RNA sequence of length n and a query RNA sequence of length m with known secondary structure that may contain simple pseudoknots or embedded simple pseudoknots. The best known algorithm for solving this problem runs in O(mn3) time for simple pseudoknot or O(mn4) time for embedded simple pseudoknot with space complexity of O(mn3) for both structures, which require too much memory making it infeasible for comparing noncoding RNAs (ncRNAs) with length several hundreds or more. We propose memory efficient algorithms to solve the same problem. We reduce the space complexity to O(n3) for simple pseudoknot and O(mn2 + n3) for embedded simple pseudoknot while maintaining the same time complexity. We also show how to modify our algorithm to handle a restricted class of recursive simple pseudoknot which is found abundant in real data with space complexity of O(mn2 + n3) and time complexity of O(mn4). Experimental results show that our algorithms are feasible for comparing ncRNAs of length more than 500.
Complexity theory, Bioinformatics, Memory management, Genomics, RNA, Computational biology, Heuristic algorithms,memory efficient algorithm., Noncoding RNAs, pseudoknot, structural alignment
T. K. F. Wong, Y. S. Chiu, T. W. Lam, S. M. Yiu, "Memory Efficient Algorithms for Structural Alignment of RNAs with Pseudoknots", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.9, no. 1, pp. 161-168, January/February 2012, doi:10.1109/TCBB.2011.66
[1] S. He, C. Liu, G. Skogerbo, H. Zhao, J. Wang, T. Liu, B. Bai, Y. Zhao, and R. Chen, “Noncode v2.0: Decoding the Non-Coding,” Nucleic Acids Research, vol. 36(Database), pp. D170-D172, 2007.
[2] S. Griffiths-Jones, A. Bateman, M. Marshall, A. Khanna, and S.R. Eddy, “Rfam: An RNA Family Database,” Nucleic Acids Research, vol. 31, no. 1, pp. 439-441, 2003.
[3] S.Y. Le, J.H. Chen, and J. Maizel, “Structure and Methods: Human Genome Initiative and DNA Recombination,” Chapter Efficient Searches for Unusual Folding Regions in RNA Sequences, vol. 1, pp. 127-130, 1990.
[4] E. Rivas and S. Eddy, “Secondary Structure Alone is Generally Not Statistically Significant for the Detection of Noncoding RNAs,” Bioinformatics, vol. 16, no. 7, pp. 583-605, 2000.
[5] R. Klein and S. Eddy, “Rsearch: Finding Homologs of Single Structured RNA Sequences,” BMC Bioinformatics, vol. 4, article 44, 2003.
[6] S. Zhang, B. Hass, E. Eskin, and V. Bafna, “Searching Genomes for Noncoding RNA Using FastR,” IEEE/ACM Trans. Computational Biology and Bioinformatics, vol. 2, no. 4, pp. 366-379, Oct.-Dec. 2005.
[7] E.P. Nawrocki and S.R. Eddy, “Query-Dependent Banding (QDB) for Faster RNA Similarity Searches,” PLoS Computational Biology, vol. 3, p. 15, 2007.
[8] J. Hen and C.W. Greider, “Functional Analysis of the Pseudoknot Structure in Human Telomerase RNA,” Proc. Nat'l Academy of Sciences USA, vol. 102, no. 23, pp. 8080-8085, 2005.
[9] E. Dam, K. Pleij, and D. Draper, “Structural and Functional Aspects of RNA Pseudoknots,” Biochemistry, vol. 31, no. 47, pp. 11665-11676, 1992.
[10] P.L. Adams, M.R. Stahley, A.B. Kosek, J. Wang, and S.A. Strobel, “Crystal Structure of a Self-Splicing Group I Intron with Both Exons,” Nature, vol. 430, pp. 45-50, 2004.
[11] H. Matsui, K. Sato, and Y. Sakakibara, “Pair Stochastic Tree Adjoining Grammars for Aligning and Predicting Pseudoknot RNA Structures,” Bioinformatics, vol. 21, pp. 2611-2617, 2005.
[12] B. Han, B. Dost, V. Bafna, and S. Zhang, “Structural Alignment of Pseudoknotted RNA,” J. Computational Biology, vol. 15, no. 5, pp. 489-504, 2008.
[13] S.R. Eddy, “A Memory-Efficient Dynamic Programming Algorithm for Optimal Alignment of a Sequence to an RNA Secondary Structure,” BMC Bioinformatics, vol. 3, article 18, 2002.
[14] Y. Song, C. Liu, X. Huang, R.L. Malmberg, Y. Xu, and L. Cai, “Efficient Parameterized Algorithms for Biopolymer Structure-Sequence Alignment,” IEEE/ACM Trans. Computational Biology and Bioinformatics, vol. 3, no. 4, pp. 423-432, Oct.-Dec. 2006.
[15] D.S. Hirschberg, “A Linear Space Algorithm for Computing Maximal Common Subsequences,” Comm. ACM, vol. 18, no. 6, pp. 341-343, 1975.
35 ms
(Ver 2.0)

Marketing Automation Platform Marketing Automation Tool