IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 31, no. 6, June 2009



pp: 1017-1032

Elżbieta Pȩkalska, University of Manchester, Manchester

Bernard Haasdonk, University of Stuttgart, Stuttgart

ABSTRACT

Kernel methods are a class of well-established and successful algorithms for pattern analysis, thanks to their mathematical elegance and good performance. Numerous nonlinear extensions of pattern recognition techniques have been proposed so far based on the so-called kernel trick. The objective of this paper is twofold. First, we derive an additional kernel tool that is still missing, namely kernel quadratic discriminant (KQD). We discuss different formulations of KQD based on the regularized kernel Mahalanobis distance in both complete and class-related subspaces. Second, we propose suitable extensions of kernel linear and quadratic discriminants to indefinite kernels. We provide classifiers that are applicable to kernels defined by any symmetric similarity measure. This is important in practice because problem-suited proximity measures often violate the requirement of positive definiteness. As in the traditional case, KQD can be advantageous for data with unequal class spreads in the kernel-induced spaces, which cannot be well separated by a linear discriminant. We illustrate this on artificial and real data for both positive definite and indefinite kernels.
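To make the central ingredient concrete: a regularized kernel Mahalanobis distance of the kind KQD builds on can be computed entirely from kernel evaluations, via an eigendecomposition of the centered class kernel matrix. The sketch below is a minimal illustration under an assumed RBF kernel and a simple ridge regularizer sigma; all function names and parameters are illustrative, not the authors' implementation.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian RBF kernel matrix between row-sample arrays A (m,d) and B (n,d)."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)

def kernel_mahalanobis_sq(K_cc, k_xc, k_xx, sigma=0.1, tol=1e-10):
    """Regularized squared kernel Mahalanobis distance of a test point to one
    class: d^2 = phi~(x)^T (Sigma + sigma*I)^{-1} phi~(x), where phi~ denotes
    class-mean-centered feature vectors.  Inputs: class kernel matrix K_cc (n,n),
    test-to-class kernel vector k_xc (n,), test self-similarity k_xx (scalar).
    This is one common regularized variant, given here only as an illustration."""
    n = K_cc.shape[0]
    row_mean = K_cc.mean(axis=0)                 # (1/n) sum_j k(x_j, x_i)
    all_mean = K_cc.mean()                       # (1/n^2) sum_{jl} K_jl
    # center the class kernel matrix in feature space
    Kt = K_cc - row_mean[None, :] - row_mean[:, None] + all_mean
    # centered test kernel vector and centered squared norm of the test point
    kt = k_xc - k_xc.mean() - row_mean + all_mean
    norm_sq = k_xx - 2.0 * k_xc.mean() + all_mean
    mu, U = np.linalg.eigh(Kt)
    keep = mu > tol                              # drop null/negative directions
    mu, U = mu[keep], U[:, keep]
    proj = (U.T @ kt) / np.sqrt(mu)              # coords on unit covariance eigvecs
    lam = mu / n                                 # covariance eigenvalues
    within = np.sum(proj ** 2 / (lam + sigma))   # distance inside the class subspace
    residual = max(norm_sq - np.sum(proj ** 2), 0.0) / sigma  # orthogonal complement
    return within + residual
```

A toy nearest-class rule then assigns a test point to the class with the smaller distance; the full KQD classifier in the paper additionally involves the quadratic discriminant's normalization terms, which are omitted here.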

INDEX TERMS

machine learning, pattern recognition, kernel methods, indefinite kernels, quadratic discriminant

CITATION

Elżbieta Pȩkalska, Bernard Haasdonk, "Kernel Discriminant Analysis for Positive Definite and Indefinite Kernels",

*IEEE Transactions on Pattern Analysis & Machine Intelligence*, vol. 31, no. 6, pp. 1017-1032, June 2009, doi:10.1109/TPAMI.2008.290