Issue No.06 - November/December (2011 vol.8)
pp: 1535-1544
Qingfeng Chen, Guangxi University, Nanning and La Trobe University, Melbourne
A large volume of raw biological sequence data has been generated by the Human Genome Project and related efforts. Understanding the structural information encoded in biological sequences is important for learning their biochemical functions, but it remains a fundamental challenge. Recent interest in RNA regulation has led to rapid growth in the number of RNA secondary structures deposited in various databases. However, the functional classification and characterization of RNA structures have been only partially addressed. This article introduces a novel interval-based distance metric for structure-based RNA function assignment. The characterization of RNA structures relies on distance vectors learned from a collection of predicted structures. The distance measure considers the intersection, disjointness, and inclusion relations between intervals. A set of RNA pseudoknotted structures with known functions is used, and the function of a query structure is determined by measuring structure similarity. This not only offers sequence distance criteria for measuring the similarity of secondary structures but also aids the functional classification of RNA structures with pseudoknots.
Pseudoknot, function, stem, loop, similarity, distance, structure.
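The abstract names three relations between intervals (intersecting, disjoint, inclusion) and a nearest-structure approach to function assignment, but it does not give the formulas. The Python sketch below is only an illustration of that idea under stated assumptions: stems are modeled as closed position intervals, and `interval_distance`, `structure_distance`, and `annotate` are hypothetical helpers, not the article's actual metric.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), closed interval with start <= end


def relation(a: Interval, b: Interval) -> str:
    """Classify how two closed intervals relate to each other."""
    (a1, a2), (b1, b2) = a, b
    if a2 < b1 or b2 < a1:
        return "disjoint"
    if (a1 <= b1 and b2 <= a2) or (b1 <= a1 and a2 <= b2):
        return "inclusion"
    return "intersecting"


def interval_distance(a: Interval, b: Interval) -> int:
    """Assumed distance: sum of endpoint offsets (not the article's formula)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def structure_distance(s: List[Interval], t: List[Interval]) -> float:
    """Symmetric mean of nearest-interval distances between two interval sets."""
    if not s or not t:
        raise ValueError("both structures need at least one interval")

    def one_way(xs: List[Interval], ys: List[Interval]) -> float:
        return sum(min(interval_distance(x, y) for y in ys) for x in xs) / len(xs)

    return (one_way(s, t) + one_way(t, s)) / 2


def annotate(query: List[Interval],
             references: List[Tuple[str, List[Interval]]]) -> str:
    """Return the function label of the reference structure closest to the query."""
    return min(references, key=lambda ref: structure_distance(query, ref[1]))[0]


# Made-up example data: labels and coordinates are illustrative only.
references = [
    ("frameshifting", [(1, 5), (10, 14)]),
    ("readthrough", [(20, 30)]),
]
query = [(2, 6), (11, 14)]
print(relation((1, 5), (4, 9)))
print(annotate(query, references))
```

In this sketch the query is assigned the label of its nearest reference structure; any real use would substitute the distance defined in the article for the assumed endpoint-offset distance.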
Qingfeng Chen, "Function Annotation for Pseudoknot Using Structure Similarity", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.8, no. 6, pp. 1535-1544, November/December 2011, doi:10.1109/TCBB.2011.50