Nonconvex Online Support Vector Machines
February 2011 (vol. 33 no. 2)
pp. 368-381
BibTeX:
@article{10.1109/TPAMI.2010.109,
  author    = {Şeyda Ertekin and Léon Bottou and C. Lee Giles},
  title     = {Nonconvex Online Support Vector Machines},
  journal   = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume    = {33},
  number    = {2},
  issn      = {0162-8828},
  year      = {2011},
  pages     = {368-381},
  doi       = {http://doi.ieeecomputersociety.org/10.1109/TPAMI.2010.109},
  publisher = {IEEE Computer Society},
  address   = {Los Alamitos, CA, USA},
}
Şeyda Ertekin, Massachusetts Institute of Technology, Cambridge
Léon Bottou, NEC Labs America, Princeton
C. Lee Giles, The Pennsylvania State University, University Park
In this paper, we propose a nonconvex online Support Vector Machine (SVM) algorithm (LASVM-NC) based on the Ramp Loss, which strongly suppresses the influence of outliers. Then, again in the online learning setting, we propose an outlier filtering mechanism (LASVM-I) based on approximating nonconvex behavior in convex optimization. Both algorithms build on another novel SVM algorithm (LASVM-G) that generates accurate intermediate models in its iterative steps by leveraging the duality gap. We present experimental results that demonstrate the merit of our frameworks in achieving significant robustness to outliers in noisy data classification, where mislabeled training instances are abundant. The experimental evaluation shows that the proposed approaches yield a more scalable online SVM algorithm with sparser models and shorter running time, in both the training and recognition phases, without sacrificing generalization performance. We also point out the relation between nonconvex optimization and min-margin active learning.
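As a rough illustration of why a ramp-shaped loss limits the influence of outliers, the sketch below compares the ordinary hinge loss with a ramp loss written in its common form as a difference of two hinge functions, R_s(z) = H_1(z) - H_s(z). The abstract does not give the formula or any parameter values, so the function names, the margin variable z = y * f(x), and the choice s = -1 are assumptions made only for this example. It is not the LASVM-NC algorithm itself.

```python
def hinge(z, a=1.0):
    """Hinge-type loss H_a(z) = max(0, a - z), where z = y * f(x) is the margin."""
    return max(0.0, a - z)


def ramp(z, s=-1.0):
    """Ramp loss R_s(z) = H_1(z) - H_s(z) for s < 1 (s = -1 is an assumed value).

    It equals the hinge loss for z >= s and stays flat at 1 - s for z < s,
    so a badly misclassified point contributes a bounded penalty.
    """
    return hinge(z, 1.0) - hinge(z, s)


if __name__ == "__main__":
    # Margins from confidently correct (2.0) to badly misclassified (-5.0).
    for z in (2.0, 0.5, -1.0, -5.0):
        print(f"z={z:5.1f}  hinge={hinge(z):4.1f}  ramp={ramp(z):4.1f}")
```

With these assumed values, the hinge loss keeps growing as the margin becomes more negative, while the ramp loss stops at 1 - s = 2. That cap is what makes the objective nonconvex and, at the same time, less sensitive to mislabeled examples.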

Index Terms:
Online learning, nonconvex optimization, support vector machines, active learning.
Citation:
Şeyda Ertekin, Léon Bottou, C. Lee Giles, "Nonconvex Online Support Vector Machines," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 2, pp. 368-381, Feb. 2011, doi:10.1109/TPAMI.2010.109