Feature Space Interpretation of SVMs with Indefinite Kernels
April 2005 (vol. 27 no. 4)
pp. 482-492
Kernel methods are becoming increasingly popular for various kinds of machine learning tasks, the most famous being the support vector machine (SVM) for classification. The SVM is well understood when using conditionally positive definite (cpd) kernel functions. However, in practice, non-cpd kernels arise and are applied in SVMs. The procedure of "plugging" these indefinite kernels into SVMs often yields good empirical classification results. However, such classifiers are hard to interpret due to a lack of geometric and theoretical understanding. In this paper, we provide a step toward the comprehension of SVM classifiers in these situations. We give a geometric interpretation of SVMs with indefinite kernel functions. We show that such SVMs are optimal hyperplane classifiers not by margin maximization, but by minimization of distances between convex hulls in pseudo-Euclidean spaces. By this, we obtain a sound framework and motivation for indefinite SVMs. This interpretation is the basis for further theoretical analysis, e.g., investigating uniqueness, and for the derivation of practical guidelines such as characterizing the suitability of indefinite SVMs.
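The pseudo-Euclidean picture the abstract refers to can be made concrete with a small numerical sketch (the data, kernel parameters, and variable names below are illustrative choices, not taken from the paper): a sigmoid/tanh kernel with a negative offset is a standard example of an indefinite kernel, and an eigendecomposition of its Gram matrix exposes the pseudo-Euclidean signature (p, q) and an embedding whose indefinite inner product reproduces the kernel exactly.

```python
import numpy as np

# Illustrative example (assumed data, not from the paper): a tanh kernel
# with negative offset is a classic indefinite kernel.
x = np.array([1.0, 0.0, -1.0])
K = np.tanh(np.outer(x, x) - 0.5)  # symmetric, but not positive semidefinite

# Eigendecomposition K = U diag(w) U^T; the counts of positive and negative
# eigenvalues give the pseudo-Euclidean signature (p, q).
w, U = np.linalg.eigh(K)
p = int(np.sum(w > 1e-10))
q = int(np.sum(w < -1e-10))

# Pseudo-Euclidean embedding: coordinates scaled by sqrt(|eigenvalue|);
# inner products use the signature matrix M = diag(sign(w)).
keep = np.abs(w) > 1e-10
Z = U[:, keep] * np.sqrt(np.abs(w[keep]))
M = np.diag(np.sign(w[keep]))

# The indefinite "inner product" Z M Z^T reproduces K exactly: the data
# live in R^(p,q) rather than in a Hilbert space.
print("signature:", (p, q))
print("reconstruction ok:", np.allclose(Z @ M @ Z.T, K))
```

In this sketch the Gram matrix has a negative diagonal entry, so at least one eigenvalue is negative and the kernel cannot be cpd; the convex-hull distances discussed in the paper are computed with respect to exactly this kind of indefinite inner product.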

[1] D. Haussler, “Convolution Kernels on Discrete Structures,” Technical Report UCS-CRL-99-10, Univ. of California, Santa Cruz, 1999.
[2] H. Lodhi et al., “Text Classification Using String Kernels,” J. Machine Learning Research, vol. 2, pp. 419-444, 2002.
[3] C. Cortes, P. Haffner, and M. Mohri, “Rational Kernels,” Proc. Advances in Neural Information Processing Systems, vol. 15, 2003.
[4] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge, U.K.: Cambridge Univ. Press, 2000.
[5] B. Schölkopf and A.J. Smola, Learning with Kernels. Cambridge, Mass.: MIT Press, 2002.
[6] O. Chapelle and V. Vapnik, “Model Selection for Support Vector Machines,” Proc. Advances in Neural Information Processing Systems, pp. 230-236, 2000.
[7] V. Vapnik, The Nature of Statistical Learning Theory. New York: Springer, 1995.
[8] C. Bahlmann, B. Haasdonk, and H. Burkhardt, “On-Line Handwriting Recognition with Support Vector Machines—A Kernel Approach,” Proc. Eighth Int'l Workshop Frontiers in Handwriting Recognition, pp. 49-54, 2002.
[9] D. DeCoste and B. Schölkopf, “Training Invariant Support Vector Machines,” Machine Learning, vol. 46, no. 1, pp. 161-190, 2002.
[10] B. Haasdonk and D. Keysers, “Tangent Distance Kernels for Support Vector Machines,” Proc. 16th Int'l Conf. Pattern Recognition, pp. 864-868, 2002.
[11] P.J. Moreno, P. Ho, and N. Vasconcelos, “A Kullback-Leibler Divergence Based Kernel for SVM Classification in Multimedia Applications,” Proc. Advances in Neural Information Processing Systems, vol. 16, pp. 1385-1392, 2004.
[12] H. Shimodaira et al., “Dynamic Time-Alignment Kernel in Support Vector Machine,” Proc. Advances in Neural Information Processing Systems, vol. 14, pp. 921-928, 2002.
[13] B. Schölkopf, “The Kernel Trick for Distances,” Technical Report MSR 2000-51, Microsoft Research, Redmond, Wash., 2000.
[14] H.-T. Lin and C.-J. Lin, “A Study on Sigmoid Kernels for SVM and the Training of non-PSD Kernels by SMO-Type Methods,” technical report, Nat'l Taiwan Univ., Mar. 2003.
[15] C.-C. Chang and C.-J. Lin, “LIBSVM: A Library for Support Vector Machines,” 2001.
[16] M. Sellathurai and S. Haykin, “The Separability Theory of Hyperbolic Tangent Kernels and Support Vector Machines for Pattern Classification,” Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing, pp. 1021-1024, 1999.
[17] B. Haasdonk and C. Bahlmann, “Learning with Distance Substitution Kernels,” Proc. 26th DAGM Symp., pp. 220-227, 2004.
[18] T. Graepel et al., “Classification on Pairwise Proximity Data,” Proc. Advances in Neural Information Processing Systems, vol. 11, pp. 438-444, 1999.
[19] E. Pekalska, P. Paclik, and R. Duin, “A Generalized Kernel Approach to Dissimilarity Based Classification,” J. Machine Learning Research, vol. 2, pp. 175-211, 2001.
[20] L. Goldfarb, “A New Approach to Pattern Recognition,” Progress in Pattern Recognition 2, pp. 241-402, 1985.
[21] X. Mary, “Hilbertian Subspaces, Subdualities and Applications,” PhD dissertation, INSA Rouen, 2003.
[22] K.P. Bennett and E.J. Bredensteiner, “Duality and Geometry in SVM Classifiers,” Proc. 17th Int'l Conf. Machine Learning, pp. 57-64, 2000.
[23] D.J. Crisp and C.J.C. Burges, “A Geometric Interpretation of ν-SVM Classifiers,” Proc. Advances in Neural Information Processing Systems, vol. 12, pp. 223-229, 2000.
[24] B. Schölkopf et al., “New Support Vector Algorithms,” Neural Computation, vol. 12, pp. 1083-1121, 2000.
[25] C.-C. Chang and C.-J. Lin, “Training ν-Support Vector Classifiers: Theory and Algorithms,” Neural Computation, vol. 13, no. 9, pp. 2119-2147, 2001.
[26] P.M. Pardalos and J.B. Rosen, Constrained Global Optimization: Algorithms and Applications. Berlin: Springer, 1987.
[27] O. Ronneberger and F. Pigorsch, “LIBSVMTL: A Support Vector Machine Template Library,” http://lmb.informatik.libsvmtl/, 2004.
[28] T. Graepel et al., “Classification on Proximity Data with LP-Machines,” Proc. Ninth Int'l Conf. Artificial Neural Networks, pp. 304-309, 1999.
[29] M. Hein and O. Bousquet, “Maximal Margin Classification for Metric Spaces,” Proc. 16th Ann. Conf. Computational Learning Theory, pp. 72-86, 2003.

Index Terms:
Support vector machine, indefinite kernel, pseudo-Euclidean space, separation of convex hulls, pattern recognition.
Bernard Haasdonk, "Feature Space Interpretation of SVMs with Indefinite Kernels," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 4, pp. 482-492, April 2005, doi:10.1109/TPAMI.2005.78