Efficient Algorithm for Localized Support Vector Machine
April 2010 (vol. 22 no. 4)
pp. 537-549
Haibin Cheng, Yahoo! Labs, Santa Clara
Pang-Ning Tan, Michigan State University, East Lansing
Rong Jin, Michigan State University, East Lansing
This paper presents a framework called Localized Support Vector Machine (LSVM) for classifying data with nonlinear decision surfaces. Instead of building a sophisticated global model from the training data, LSVM constructs multiple linear SVMs, each designed to accurately classify a given test example. A major limitation of this framework is its high computational cost, since a unique model must be constructed for each test example. To overcome this limitation, we propose an efficient implementation of LSVM, termed Profile SVM (PSVM). PSVM partitions the training examples into clusters and builds a separate linear SVM model for each cluster. Our empirical results show that 1) LSVM and PSVM outperform nonlinear SVM on all 20 of the evaluated data sets and 2) PSVM achieves accuracy comparable to that of LSVM but with significant computational savings. We also demonstrate the efficacy of the proposed approaches in classifying data with spatial and temporal dependencies.
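
The cluster-then-train idea behind PSVM can be illustrated with a rough sketch. This is not the authors' implementation: the abstract does not specify the clustering procedure or the SVM solver, so the sketch assumes plain k-means with a deterministic initialization and a linear SVM trained by batch subgradient descent on the regularized hinge loss. All function names (`kmeans`, `train_linear_svm`, `fit_local_svms`, `predict`), the hyperparameters, and the toy data are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))

def kmeans(points, k, iters=20):
    """Lloyd's k-means with evenly spaced initial centroids (an assumption)."""
    n = len(points)
    centroids = [list(points[(j * n) // k]) for j in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for j, g in enumerate(groups):
            if g:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
    return centroids

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=500):
    """Batch subgradient descent on the L2-regularized hinge loss; y in {-1, +1}."""
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:  # margin violated: hinge term is active
                grad_w = [g - yi * xj / n for g, xj in zip(grad_w, xi)]
                grad_b -= yi / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def fit_local_svms(X, y, k):
    """Cluster the training set, then fit one linear SVM per cluster."""
    centroids = kmeans(X, k)
    models = []
    for j, c in enumerate(centroids):
        members = [i for i, p in enumerate(X) if nearest(p, centroids) == j]
        if not members:  # no training data in this cluster: fall back to a zero model
            models.append(([0.0] * len(c), 0.0))
            continue
        # Center on the cluster centroid so each model is expressed locally.
        Xc = [[a - m for a, m in zip(X[i], c)] for i in members]
        models.append(train_linear_svm(Xc, [y[i] for i in members]))
    return centroids, models

def predict(x, centroids, models):
    """Route x to its nearest cluster and apply that cluster's linear SVM."""
    j = nearest(x, centroids)
    w, b = models[j]
    score = dot(w, [a - m for a, m in zip(x, centroids[j])]) + b
    return 1 if score >= 0 else -1

# Toy 1-D data: the positive class sits between two negative regions,
# so no single linear boundary separates it, but two local ones do.
X = [[-4.0], [-3.5], [-2.0], [-1.5], [1.5], [2.0], [3.5], [4.0]]
y = [-1, -1, 1, 1, 1, 1, -1, -1]
centroids, models = fit_local_svms(X, y, k=2)
preds = [predict([v], centroids, models) for v in (-3.8, -1.0, 1.0, 3.9)]
print(preds)
```

The computational saving the abstract refers to shows up in the structure of the sketch: training cost is paid once per cluster rather than once per test example, and prediction reduces to a nearest-centroid lookup followed by one linear evaluation.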

Index Terms:
Classification, support vector machine, kernel-based learning, local learning.
Haibin Cheng, Pang-Ning Tan, Rong Jin, "Efficient Algorithm for Localized Support Vector Machine," IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 4, pp. 537-549, April 2010, doi:10.1109/TKDE.2009.116