A Novel Kernel Method for Clustering
May 2005 (vol. 27 no. 5)
pp. 801-804
Kernel methods are algorithms that, by replacing the inner product with an appropriate positive definite function, implicitly perform a nonlinear mapping of the input data into a high-dimensional feature space. In this paper, we present a kernel method for clustering inspired by the classical K-Means algorithm, in which each cluster is iteratively refined using a one-class Support Vector Machine. Our method, which is easy to implement, compares favorably with popular clustering algorithms such as K-Means, Neural Gas, and Self-Organizing Maps on a synthetic data set and three real-data benchmarks from the UCI repository (IRIS data, Wisconsin breast cancer database, and Spam database).
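
The first sentence of the abstract can be made concrete with a small sketch. It is not taken from the paper: the homogeneous degree-2 polynomial kernel, the 2-D inputs, and all function names are assumptions chosen for illustration. For this kernel the feature map can be written out explicitly, so the snippet checks that evaluating the kernel gives the same inner product, and the same squared distance, as working in the mapped space directly.

```python
import math


def poly2_kernel(x, y):
    """Homogeneous degree-2 polynomial kernel: k(x, y) = (x . y)^2."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot ** 2


def poly2_feature_map(x):
    """Explicit map for 2-D inputs whose inner product equals poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def feature_space_sq_distance(kernel, x, y):
    """Squared distance between the images of x and y, from kernel values only.

    Uses ||phi(x) - phi(y)||^2 = k(x, x) - 2 k(x, y) + k(y, y).
    """
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)


# Two illustrative input points (hypothetical values).
x, y = (1.0, 2.0), (3.0, -1.0)
phi_x, phi_y = poly2_feature_map(x), poly2_feature_map(y)

# Inner product and squared distance computed in the explicit feature space.
explicit_dot = sum(a * b for a, b in zip(phi_x, phi_y))
explicit_sq_distance = sum((a - b) ** 2 for a, b in zip(phi_x, phi_y))

# The same quantities computed without ever forming phi(x) or phi(y).
implicit_dot = poly2_kernel(x, y)
implicit_sq_distance = feature_space_sq_distance(poly2_kernel, x, y)

print(explicit_dot, implicit_dot)
print(explicit_sq_distance, implicit_sq_distance)
```

Because distances in feature space are available from kernel evaluations alone, a distance-based procedure in the spirit of K-Means can in principle be carried out in that space even when the map has no convenient explicit form, as with a Gaussian kernel.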

Index Terms:
Kernel methods, one class SVM, clustering algorithms, EM algorithm, K-Means.
Citation:
Francesco Camastra, Alessandro Verri, "A Novel Kernel Method for Clustering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 5, pp. 801-804, May 2005, doi:10.1109/TPAMI.2005.88