IEEE/ACM Transactions on Computational Biology and Bioinformatics

Issue No.02 - April-June (2009 vol.6)

pp: 190-199

Shibin Qiu, Pathwork Diagnostics, Inc., Sunnyvale

Terran Lane, University of New Mexico, Albuquerque

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2008.139

ABSTRACT

RNA interference, a cellular defense mechanism, has applications in gene function analysis and shows promise for human disease therapy. Silencing a target gene effectively requires selecting initiator siRNA molecules with satisfactory silencing capability. Computational prediction of siRNA silencing efficacy can assist this screening before the molecules are used in biological experiments. String kernel functions, which operate directly on the string objects representing siRNAs and target mRNAs, have been applied to support vector regression for this prediction and have improved accuracy over numerical kernels defined on multidimensional vector spaces constructed from descriptors of siRNA design rules. To fully utilize the information provided by string and numerical data, we propose to unify the two in a kernel feature space by devising a multiple kernel regression framework in which a linear combination of the kernels is used. We formulate multiple kernel learning as a quadratically constrained quadratic programming (QCQP) problem, which, although it yields a globally optimal solution, is computationally demanding and requires a commercial solver package. We further propose three heuristics based on the principle of kernel-target alignment and predictive accuracy. Empirical results demonstrate that multiple kernel regression can improve accuracy, decrease model complexity by reducing the number of support vectors, and dramatically speed up computation. In addition, multiple kernel regression evaluates the importance of the constituent kernels, which, for the siRNA efficacy prediction problem, amounts to comparing the relative significance of the design rules. Finally, we give insights into the multiple kernel regression mechanism and point out possible extensions.
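
As a rough illustration of the idea of weighting constituent kernels by kernel-target alignment, the following minimal Python sketch computes the alignment between each precomputed kernel matrix and the regression targets, then forms a linear combination of the kernels. The abstract does not spell out the three heuristics, so the weighting rule here (weights proportional to non-negative alignment, normalized to sum to one) is an assumption made for illustration, and the function names `alignment`, `alignment_weights`, and `combine` are hypothetical.

```python
import math


def frobenius_inner(a, b):
    """Frobenius inner product <A, B>_F of two equally sized matrices."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def alignment(kernel, targets):
    """Kernel-target alignment <K, yy^T>_F / (||K||_F * ||yy^T||_F)."""
    target_matrix = [[yi * yj for yj in targets] for yi in targets]
    numerator = frobenius_inner(kernel, target_matrix)
    denominator = math.sqrt(
        frobenius_inner(kernel, kernel)
        * frobenius_inner(target_matrix, target_matrix)
    )
    return numerator / denominator if denominator > 0 else 0.0


def alignment_weights(kernels, targets):
    """Assumed heuristic: weights proportional to non-negative alignment."""
    scores = [max(alignment(k, targets), 0.0) for k in kernels]
    total = sum(scores)
    if total == 0:
        # No kernel aligns with the targets: fall back to uniform weights.
        return [1.0 / len(kernels)] * len(kernels)
    return [s / total for s in scores]


def combine(kernels, weights):
    """Linear combination sum_m w_m * K_m of precomputed kernel matrices."""
    n = len(kernels[0])
    return [
        [sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]


# Toy data (made up): centered targets and two precomputed kernel matrices.
targets = [1.0, -1.0, 0.5, -0.5]
ideal_kernel = [[yi * yj for yj in targets] for yi in targets]
identity_kernel = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

weights = alignment_weights([ideal_kernel, identity_kernel], targets)
combined_kernel = combine([ideal_kernel, identity_kernel], weights)
```

The combined matrix could then be passed to any support vector regression implementation that accepts a precomputed kernel. The weights themselves give a crude ranking of the constituent kernels, which is the sense in which a multiple kernel method can compare the importance of its inputs.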

INDEX TERMS

Multiple kernel learning, multiple kernel heuristics, support vector regression, QCQP optimization, RNA interference, siRNA efficacy.

CITATION

Shibin Qiu, Terran Lane, "A Framework for Multiple Kernel Support Vector Regression and Its Applications to siRNA Efficacy Prediction",

*IEEE/ACM Transactions on Computational Biology and Bioinformatics*, vol. 6, no. 2, pp. 190-199, April-June 2009, doi:10.1109/TCBB.2008.139