Nonsmooth Optimization Techniques for Semisupervised Classification
December 2007 (vol. 29 no. 12)
pp. 1-1
A. Astorino, Univ. della Calabria, Rende
We apply nonsmooth optimization techniques to classification problems, with particular reference to the transductive support vector machine (TSVM) approach, where the considered decision function is nonconvex and nondifferentiable and hence difficult to minimize. We present numerical results obtained by running the proposed method on standard test problems drawn from the binary classification literature.
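To illustrate why a TSVM-type function is nonconvex and nondifferentiable, the following is a minimal sketch of a commonly used linear TSVM objective: a regularization term, a hinge loss on the labeled points, and a symmetric hinge loss max(0, 1 - |w·x + b|) on the unlabeled points. It is an assumption for illustration only and is not taken from the paper, whose exact formulation may differ; the function name `tsvm_objective` and the weights `c_lab` and `c_unl` are hypothetical.

```python
def tsvm_objective(w, b, labeled, unlabeled, c_lab=1.0, c_unl=1.0):
    """Illustrative linear TSVM-style objective (not the paper's exact model).

    w         -- list of weights
    b         -- bias
    labeled   -- list of (x, y) pairs with y in {-1, +1}
    unlabeled -- list of feature vectors x
    """
    def score(x):
        # Linear decision value w.x + b
        return sum(wi * xi for wi, xi in zip(w, x)) + b

    # Regularization term 0.5 * ||w||^2 (convex, smooth)
    reg = 0.5 * sum(wi * wi for wi in w)
    # Hinge loss on labeled points (convex, nondifferentiable at the kink)
    lab = sum(max(0.0, 1.0 - y * score(x)) for x, y in labeled)
    # Symmetric hinge on unlabeled points (nonconvex, nondifferentiable)
    unl = sum(max(0.0, 1.0 - abs(score(x))) for x in unlabeled)
    return reg + c_lab * lab + c_unl * unl


# One unlabeled point x = 1 in one dimension, no labeled data, b = 0.
f_left = tsvm_objective([-1.0], 0.0, [], [[1.0]], c_unl=2.0)
f_mid = tsvm_objective([0.0], 0.0, [], [[1.0]], c_unl=2.0)
f_right = tsvm_objective([1.0], 0.0, [], [[1.0]], c_unl=2.0)
```

In this one-dimensional case the value at the midpoint w = 0 (2.0) exceeds the average of the values at w = -1 and w = 1 (0.5 each), which a convex function cannot do. The absolute value and the max also introduce kinks, so gradient-based methods do not apply directly.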

Index Terms:
Support vector machines, Semisupervised learning, Support vector machine classification, Testing, Pattern classification, Predictive models, Optimization methods, Machine learning, Mathematical model, Computational efficiency, bundle methods, semi-supervised learning, nonsmooth optimization
A. Astorino, A. Fuduli, "Nonsmooth Optimization Techniques for Semisupervised Classification," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 12, pp. 1-1, Dec. 2007, doi:10.1109/TPAMI.2007.1102