Issue No.10 - October (2010 vol.32)
pp: 1888-1898
Mark A. Davenport, Stanford University, Stanford
Richard G. Baraniuk, Rice University, Houston
Clayton D. Scott, University of Michigan, Ann Arbor
ABSTRACT
This paper studies the training of support vector machine (SVM) classifiers with respect to the minimax and Neyman-Pearson criteria. In principle, these criteria can be optimized in a straightforward way using a cost-sensitive SVM. In practice, however, because these criteria require especially accurate error estimation, standard techniques for tuning SVM parameters, such as cross-validation, can lead to poor classifier performance. To address this issue, we first prove that the usual cost-sensitive SVM, here called the 2C-SVM, is equivalent to another formulation called the $2\nu$-SVM. We then exploit a characterization of the $2\nu$-SVM parameter space to develop a simple yet powerful approach to error estimation based on smoothing. In an extensive experimental study, we demonstrate that smoothing significantly improves the accuracy of cross-validation error estimates, leading to dramatic performance gains. Furthermore, we propose coordinate descent strategies that offer significant gains in computational efficiency, with little to no loss in performance.
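As a rough illustration of the idea in the abstract, the sketch below smooths a grid of cross-validation error estimates before choosing parameters under the minimax and Neyman-Pearson criteria. It is a minimal sketch under stated assumptions, not the authors' procedure: the 3x3 mean filter stands in for whatever smoothing the paper uses, the grid indices stand in for pairs of SVM parameters, and the error values are invented for the example.

```python
from typing import List, Optional, Tuple

Grid = List[List[float]]


def smooth(grid: Grid) -> Grid:
    """Replace each cell by the mean of its in-bounds 3x3 neighbourhood.

    Assumption: a plain mean filter, chosen only for illustration.
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            window = [
                grid[a][b]
                for a in range(max(0, i - 1), min(rows, i + 2))
                for b in range(max(0, j - 1), min(cols, j + 2))
            ]
            out[i][j] = sum(window) / len(window)
    return out


def select_minimax(fa: Grid, miss: Grid) -> Tuple[int, int]:
    """Grid index minimising max(false-alarm rate, miss rate)."""
    cells = [(i, j) for i in range(len(fa)) for j in range(len(fa[0]))]
    return min(cells, key=lambda c: max(fa[c[0]][c[1]], miss[c[0]][c[1]]))


def select_neyman_pearson(
    fa: Grid, miss: Grid, alpha: float
) -> Optional[Tuple[int, int]]:
    """Grid index minimising the miss rate subject to false-alarm rate <= alpha.

    Returns None when no grid point satisfies the constraint.
    """
    feasible = [
        (i, j)
        for i in range(len(fa))
        for j in range(len(fa[0]))
        if fa[i][j] <= alpha
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda c: miss[c[0]][c[1]])


# Hypothetical cross-validation estimates on a 3x3 parameter grid.
# Cell (0, 0) looks excellent, but its neighbours suggest that is noise.
fa_cv = [
    [0.05, 0.40, 0.40],
    [0.40, 0.20, 0.15],
    [0.40, 0.15, 0.10],
]
miss_cv = [
    [0.04, 0.35, 0.45],
    [0.35, 0.22, 0.18],
    [0.45, 0.18, 0.12],
]

raw_choice = select_minimax(fa_cv, miss_cv)
fa_s, miss_s = smooth(fa_cv), smooth(miss_cv)
smoothed_choice = select_minimax(fa_s, miss_s)
np_choice = select_neyman_pearson(fa_s, miss_s, alpha=0.2)
```

On these made-up numbers, selection on the raw estimates follows the isolated low cell at (0, 0), while selection on the smoothed estimates moves to (2, 2), where the neighbouring grid points agree that both error rates are low. This is the kind of effect the abstract attributes to smoothing, shown here only qualitatively.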
INDEX TERMS
Minimax classification, Neyman-Pearson classification, support vector machine, error estimation, parameter selection.
CITATION
Mark A. Davenport, Richard G. Baraniuk, Clayton D. Scott, "Tuning Support Vector Machines for Minimax and Neyman-Pearson Classification", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 32, no. 10, pp. 1888-1898, October 2010, doi:10.1109/TPAMI.2010.29