Issue No. 06 - June 2009 (vol. 31)
pp. 1017-1032
Elżbieta Pȩkalska, University of Manchester, Manchester
Bernard Haasdonk, University of Stuttgart, Stuttgart
ABSTRACT
Kernel methods are a class of well-established and successful algorithms for pattern analysis, thanks to their mathematical elegance and good performance. Numerous nonlinear extensions of pattern recognition techniques based on the so-called kernel trick have been proposed. The objective of this paper is twofold. First, we derive a kernel tool that has so far been missing, namely the kernel quadratic discriminant (KQD). We discuss different formulations of KQD based on the regularized kernel Mahalanobis distance in both complete and class-related subspaces. Second, we propose suitable extensions of kernel linear and quadratic discriminants to indefinite kernels. We provide classifiers that are applicable to kernels defined by any symmetric similarity measure. This is important in practice because problem-suited proximity measures often violate the requirement of positive definiteness. As in the traditional case, KQD can be advantageous for data with unequal class spreads in the kernel-induced spaces, which cannot be well separated by a linear discriminant. We illustrate this on artificial and real data for both positive definite and indefinite kernels.
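The central computation behind KQD is a class-wise Mahalanobis distance evaluated implicitly through the kernel. Below is a minimal sketch of this idea for a positive definite kernel, assuming a simple ridge regularization of the feature-space class covariance; the function name, parameters, and this particular regularizer are illustrative assumptions, not the paper's exact formulation (which also develops class-related subspace variants and the Krein-space treatment needed for indefinite kernels).

    import numpy as np

    # Hypothetical sketch: squared regularized kernel Mahalanobis distance of a
    # test point x to one class, computed entirely via the kernel trick.
    #   K_cc   -- (n, n) kernel matrix among the n training points of the class
    #   k_xc   -- (n,)   kernel values k(x, x_i) against those training points
    #   k_xx   -- scalar kernel value k(x, x)
    #   sigma2 -- ridge added to the feature-space class covariance (an assumed
    #             regularizer, standing in for the paper's variants)
    def kqd_mahalanobis_sq(K_cc, k_xc, k_xx, sigma2=1e-3, tol=1e-10):
        n = K_cc.shape[0]
        # Centre all kernel quantities w.r.t. the class mean in feature space.
        Kt = K_cc - K_cc.mean(axis=0) - K_cc.mean(axis=1)[:, None] + K_cc.mean()
        kt = k_xc - k_xc.mean() - K_cc.mean(axis=0) + K_cc.mean()
        ktt = k_xx - 2.0 * k_xc.mean() + K_cc.mean()   # ||phi(x) - mean||^2

        # Eigen-decompose the centred class kernel matrix; the feature-space
        # covariance has eigenvalues lam / n on the span of the training data.
        lam, U = np.linalg.eigh(Kt)
        keep = lam > tol * max(lam.max(), 1.0)
        lam, U = lam[keep], U[:, keep]

        # Projections of the centred test image onto the unit covariance
        # eigenvectors v_i = Phi_c u_i / sqrt(lam_i).
        p = (kt @ U) / np.sqrt(lam)
        # (C + sigma2*I)^(-1) acts as 1/(lam_i/n + sigma2) on the span of the
        # class data and as 1/sigma2 on its orthogonal complement.
        d2_span = np.sum(p**2 / (lam / n + sigma2))
        d2_perp = max(ktt - np.sum(p**2), 0.0) / sigma2
        return d2_span + d2_perp

A quadratic discriminant built from this quantity would, as in classical QDA, assign a test point to the class minimizing the squared distance plus the log-determinant of the regularized class covariance; handling indefinite kernels additionally requires the Krein-space constructions developed in the paper.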
INDEX TERMS
machine learning, pattern recognition, kernel methods, indefinite kernels, quadratic discriminant
CITATION
Elżbieta Pȩkalska and Bernard Haasdonk, "Kernel Discriminant Analysis for Positive Definite and Indefinite Kernels," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 6, pp. 1017-1032, June 2009, doi:10.1109/TPAMI.2008.290.