Issue No.11 - November (2009 vol.31)
pp: 2088-2092
Mingrui Wu, Yahoo! Inc., Sunnyvale
Jieping Ye, Arizona State University, Tempe
ABSTRACT
We present a small sphere and large margin approach for novelty detection problems in which the majority of the training data are normal examples and a small number are abnormal examples, or outliers. The basic idea is to construct a hypersphere that contains most of the normal examples such that the volume of the sphere is as small as possible while the margin between its surface and the outlier training data is as large as possible. This can result in a closed and tight boundary around the normal data. Building such a sphere requires solving only a convex optimization problem, which can be done efficiently with existing software packages for training $\nu$-Support Vector Machines. Experimental results are provided to validate the effectiveness of the proposed algorithm.
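The geometric idea in the abstract can be illustrated with a toy sketch. This is not the authors' convex program or solver: as a simplifying assumption, the center is fixed at the mean of the normal examples, the radius is the smallest value that encloses all of them, and the margin is measured as the gap between the sphere surface and the nearest outlier. The function names (`mean_center`, `fit_toy_sphere`, `is_normal`) and the sample points are hypothetical and chosen only for illustration.

```python
import math

def mean_center(points):
    """Coordinate-wise mean of a non-empty list of equal-length points."""
    dim = len(points[0])
    return [sum(p[k] for p in points) / len(points) for k in range(dim)]

def fit_toy_sphere(normal, outliers):
    """Toy stand-in for the sphere fit (assumption: center = mean of normals).

    Returns (center, radius, margin). The radius is the smallest one that
    encloses every normal example; the margin is the gap between the sphere
    surface and the nearest outlier (negative if an outlier lies inside).
    """
    center = mean_center(normal)
    radius = max(math.dist(center, p) for p in normal)
    margin = min(math.dist(center, q) for q in outliers) - radius
    return center, radius, margin

def is_normal(x, center, radius):
    """Decision rule: a point inside or on the sphere is labeled normal."""
    return math.dist(center, x) <= radius

# Hypothetical 2-D data: four normal examples and two labeled outliers.
normal = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
outliers = [(3.0, 3.0), (-2.0, 0.5)]

center, radius, margin = fit_toy_sphere(normal, outliers)
print(center, radius, margin)
print(is_normal((0.5, 0.6), center, radius))  # near the center
print(is_normal((3.0, 3.0), center, radius))  # a training outlier
```

In the approach described above, the center, radius, and margin are instead obtained jointly from a convex optimization problem, and the sphere need only contain most of the normal examples; the sketch shows only the roles these three quantities play in the decision boundary.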
INDEX TERMS
Novelty detection, one-class classification, support vector machine, kernel methods.
CITATION
Mingrui Wu, Jieping Ye, "A Small Sphere and Large Margin Approach for Novelty Detection Using Training Data with Outliers," IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 31, no. 11, pp. 2088-2092, November 2009, doi:10.1109/TPAMI.2009.24