ABSTRACT
We develop a new and effective multiple kernel learning algorithm based on NmCCA, a newly proposed formulation of Canonical Correlation Analysis (CCA) that extends CCA to more than two views of the same phenomenon. First, we adopt empirical kernel maps to embed the input data into m different feature spaces, each corresponding to one of the kernels. Then, by incorporating NmCCA into a regularization-based learning framework, we obtain a single learning process in which a special term, the Inter-Function Similarity Loss R_IFSL, enforces agreement among the outputs of the multiple views. In our implementation, we adopt the Modification of Ho-Kashyap algorithm with Squared approximation of the misclassification errors (MHKS) as the incorporated paradigm. Experimental results on benchmark datasets demonstrate the feasibility and effectiveness of the proposed algorithm, named MultiK-MHKS.
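The abstract describes two concrete steps: an empirical kernel map that turns each of the m kernels into an explicit feature space, and a coupled training process in which an agreement term (R_IFSL) ties the m view outputs together. The NumPy sketch below illustrates that structure under stated assumptions: the names (`empirical_kernel_map`, `multiview_fit`, the penalty weight `gamma`) are ours, and a ridge-style least-squares update stands in for the paper's MHKS classifier, so this is a minimal illustration of the idea, not the authors' exact algorithm.

```python
import numpy as np

def empirical_kernel_map(K, tol=1e-10):
    """Return a function mapping kernel rows to explicit feature vectors.

    Standard empirical kernel map: with K = Q diag(lam) Q^T,
    phi(x) = diag(lam_r)^{-1/2} Q_r^T k_x, where
    k_x = [k(x, x_1), ..., k(x, x_n)]^T and only eigenvalues
    above `tol` are kept.
    """
    lam, Q = np.linalg.eigh(K)           # K is symmetric PSD
    keep = lam > tol
    P = Q[:, keep] / np.sqrt(lam[keep])  # n x r projection matrix
    return lambda K_rows: K_rows @ P     # rows: kernel values vs. training set

def rbf(X, Z, gamma):
    d = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d)

def poly(X, Z, degree=2):
    return (X @ Z.T + 1.0) ** degree

def multiview_fit(Phis, y, c=1e-2, gamma=1.0, iters=20):
    """Alternating closed-form updates: one regularized least-squares
    classifier per empirical feature space, coupled by an agreement
    penalty gamma * sum_{j != i} ||Phi_i w_i - Phi_j w_j||^2 (the role
    R_IFSL plays in the paper; least squares stands in for MHKS here)."""
    m = len(Phis)
    ws = [np.zeros(P.shape[1]) for P in Phis]
    for _ in range(iters):
        for i, P in enumerate(Phis):
            others = sum(Phis[j] @ ws[j] for j in range(m) if j != i)
            A = (1.0 + gamma * (m - 1)) * P.T @ P + c * np.eye(P.shape[1])
            b = P.T @ (y + gamma * others)
            ws[i] = np.linalg.solve(A, b)
    return ws

# toy usage on two Gaussian blobs
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (40, 2)), rng.normal(1, 1, (40, 2))])
y = np.hstack([-np.ones(40), np.ones(40)])

kernels = [lambda A, B: rbf(A, B, 0.5), lambda A, B: poly(A, B, 2)]
maps = [empirical_kernel_map(k(X, X)) for k in kernels]
Phis = [f(k(X, X)) for f, k in zip(maps, kernels)]
ws = multiview_fit(Phis, y)

# final output: average the m view outputs, then take the sign
scores = sum(f(k(X, X)) @ w for f, k, w in zip(maps, kernels, ws)) / len(ws)
print("training accuracy:", (np.sign(scores) == y).mean())
```

The alternating update has a closed form because, with the other views fixed, each view solves a ridge regression whose normal equations absorb the agreement penalty; the final decision averages the m view outputs and takes the sign.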
INDEX TERMS
Multiple kernel learning, Canonical correlation analysis, Modified Ho-Kashyap algorithm, Single learning process, Pattern recognition
CITATION
Zhe Wang, Songcan Chen, Tingkai Sun, "MultiK-MHKS: A Novel Multiple Kernel Learning Algorithm," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 2, pp. 348-353, February 2008, doi:10.1109/TPAMI.2007.70786.