Discovery of Structural and Functional Features in RNA Pseudoknots
July 2009 (vol. 21 no. 7)
pp. 974-984
Qingfeng Chen, Deakin University, Australia
Yi-Ping Phoebe Chen, Deakin University and ARC Centre of Excellence in Bioinformatics, Australia
An RNA pseudoknot consists of nonnested double-stranded stems connected by single-stranded loops. RNA pseudoknots are increasingly recognized as one of the most prevalent RNA structures, fulfilling a diverse set of biological roles within cells. Studies of pseudoknotted structures are growing in number, and functions are increasingly being assigned to them. These studies not only produce valuable structural data but also facilitate an understanding of structural and functional characteristics of RNA molecules. PseudoBase is a database providing structural, functional, and sequence data related to RNA pseudoknots. To capture the features of RNA pseudoknots, we present a novel framework that uses quantitative association rule mining to analyze the pseudoknot data. The derived rules are classified into specified association groups regarding structure, function, and category of RNA pseudoknots. The discovered association rules assist biologists in filtering out significant knowledge of structure-function and structure-category relationships. A brief biological interpretation of the relationships is presented, and their potential correlations with each other are highlighted.
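As a rough illustration of what quantitative association rule mining involves in general, the following Python sketch partitions numeric attributes into intervals and then computes the support and confidence of a single rule. It is not the authors' framework: the records are invented toy values (not PseudoBase data), and the attribute names (`stem1`, `loop1`, `function`), the interval boundaries, and all helper functions are assumptions made only for this example.

```python
# Toy records, invented purely for illustration. They are NOT taken from
# PseudoBase and carry no biological meaning.
records = [
    {"stem1": 6, "loop1": 2, "function": "frameshifting"},
    {"stem1": 5, "loop1": 1, "function": "frameshifting"},
    {"stem1": 3, "loop1": 4, "function": "other"},
    {"stem1": 6, "loop1": 3, "function": "frameshifting"},
]

# Hypothetical interval edges used to partition each numeric attribute.
bins = {"stem1": [0, 5, 10], "loop1": [0, 3, 6]}


def partition(value, edges):
    """Map a number to a half-open interval label such as '[5,10)'."""
    for lo, hi in zip(edges, edges[1:]):
        if lo <= value < hi:
            return f"[{lo},{hi})"
    return None  # value falls outside every interval


def to_items(record):
    """Turn one record into a set of (attribute, value-or-interval) items."""
    items = set()
    for attr, value in record.items():
        if attr in bins:
            items.add((attr, partition(value, bins[attr])))
        else:
            items.add((attr, value))
    return frozenset(items)


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent) / support(antecedent)."""
    s_a = support(antecedent, transactions)
    if s_a == 0:
        return 0.0
    return support(antecedent | consequent, transactions) / s_a


transactions = [to_items(r) for r in records]
antecedent = frozenset({("stem1", "[5,10)")})
consequent = frozenset({("function", "frameshifting")})

rule_support = support(antecedent | consequent, transactions)
rule_confidence = confidence(antecedent, consequent, transactions)
print(rule_support, rule_confidence)
```

On this toy input, three of the four records have `stem1` in `[5,10)` and all three are labeled `frameshifting`, so the rule has support 0.75 and confidence 1.0. A full miner would enumerate candidate itemsets and keep those above chosen support and confidence thresholds; that search is omitted here.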

Index Terms:
RNA pseudoknots, stem, loop, association rule mining, PseudoBase, H-pseudoknot, function, structure, partition.
Qingfeng Chen, Yi-Ping Phoebe Chen, "Discovery of Structural and Functional Features in RNA Pseudoknots," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 7, pp. 974-984, July 2009, doi:10.1109/TKDE.2008.231