A Support Vector Machine with a Hybrid Kernel and Minimal Vapnik-Chervonenkis Dimension
April 2004 (vol. 16, no. 4)
pp. 385-395
Ying Tan and Jun Wang

Abstract—This paper presents a mechanism for training support vector machines (SVMs) with a hybrid kernel and minimal Vapnik-Chervonenkis (VC) dimension. After characterizing the VC dimension of sets of separating hyperplanes in the high-dimensional feature space induced by a kernel mapping from the input space, we propose an optimization criterion that designs SVMs by minimizing an upper bound on the VC dimension. This method realizes structural risk minimization and exploits a flexible kernel function so that superior generalization on test data can be obtained. To obtain such a flexible kernel, we develop a hybrid kernel function built from common Mercer kernels (polynomial, radial basis function, two-layer neural network, etc.) and give a sufficient condition for it to be an admissible Mercer kernel. The nonnegative combination coefficients and parameters of the hybrid kernel are determined by minimizing the upper bound on the VC dimension of the learning machine. Experimental results illustrate the proposed method and show that the SVM with the hybrid kernel outperforms SVMs with any single common kernel in terms of generalization power.
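
As a rough illustration of the bound-driven kernel selection described in the abstract (a sketch under stated assumptions, not the authors' implementation), the Python fragment below forms a hybrid kernel as a nonnegative combination of a polynomial kernel and an RBF kernel, trains an SVM on the precomputed Gram matrix, and scores each candidate combination with the classical proxy R^2||w||^2 for the VC-dimension bound. The coefficient grid, the kernel parameters, and the centroid-based estimate of the enclosing radius R are all illustrative assumptions.

import numpy as np
from sklearn.svm import SVC

def poly_kernel(X, Y, degree=2):
    # Polynomial Mercer kernel (x.y + 1)^d.
    return (X @ Y.T + 1.0) ** degree

def rbf_kernel(X, Y, gamma=0.5):
    # Gaussian RBF Mercer kernel exp(-gamma ||x - y||^2).
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * d2)

def hybrid_kernel(X, Y, c):
    # A nonnegative combination of Mercer kernels is again a Mercer kernel,
    # which is the kind of admissibility condition the abstract refers to.
    return c[0] * poly_kernel(X, Y) + c[1] * rbf_kernel(X, Y)

def vc_bound_proxy(K, y):
    # ||w||^2 = sum_ij alpha_i alpha_j y_i y_j K(x_i, x_j) from the trained SVM.
    svc = SVC(kernel="precomputed", C=10.0).fit(K, y)
    alpha = np.zeros(len(y))
    alpha[svc.support_] = np.abs(svc.dual_coef_.ravel())
    ay = alpha * y
    w2 = ay @ K @ ay
    # Radius of the sphere centered at the feature-space centroid; this is an
    # upper bound on the minimal enclosing radius R (a crude stand-in here).
    R2 = (np.diag(K) - 2.0 * K.mean(axis=1) + K.mean()).max()
    return R2 * w2  # proxy for the VC bound h <= R^2 ||w||^2 + 1

# Toy data and a hypothetical coefficient grid c = (c1, 1 - c1), c1 >= 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
y = np.where(X[:, 0] * X[:, 1] > 0, 1, -1)
best = min(((c1, 1.0 - c1) for c1 in np.linspace(0.0, 1.0, 11)),
           key=lambda c: vc_bound_proxy(hybrid_kernel(X, X, c), y))
print("coefficients minimizing the bound proxy:", best)

Because nonnegative sums of Mercer kernels are again Mercer kernels, every candidate in the grid is an admissible kernel; only the value of the bound, not kernel validity, varies across the search.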

Index Terms:
Support vector machines, structural risk minimization, VC dimension, hybrid kernel function, hyperplane.
Citation:
Ying Tan, Jun Wang, "A Support Vector Machine with a Hybrid Kernel and Minimal Vapnik-Chervonenkis Dimension," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 4, pp. 385-395, April 2004, doi:10.1109/TKDE.2004.1269664