Issue No.05 - May (2013 vol.25)

pp: 1083-1096

Shan-Hung Wu, National Tsing Hua University, Hsinchu

Keng-Pei Lin, National Sun Yat-sen University, Kaohsiung

Hao-Heng Chien, National Tsing Hua University, Hsinchu

Chung-Min Chen, Telcordia Technologies Inc., Piscataway

Ming-Syan Chen, National Taiwan University, Taipei

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2012.46

ABSTRACT

Support Vector Machines (SVMs) have been widely used for classification due to their ability to achieve low generalization error. In many practical applications of classification, however, a wrong prediction for a certain class is much more severe than one for the other classes, making the original SVM unsatisfactory. In this paper, we propose the notion of the Asymmetric Support Vector Machine (ASVM), an asymmetric extension of the SVM, for these applications. Unlike existing SVM extensions such as thresholding and parameter tuning, ASVM employs a new objective that models the imbalance between the costs of false predictions from different classes in a novel way, such that the user's tolerance on the false-positive rate can be explicitly specified. This new objective formulation allows us to obtain a lower false-positive rate without much degradation of the prediction accuracy or increase in training time. Furthermore, we show that the generalization ability is preserved with the new objective. We also study the effects of the parameters in the ASVM objective and address some implementation issues related to Sequential Minimal Optimization (SMO) to cope with large-scale data. An extensive simulation is conducted and shows that ASVM is able to yield either a noticeable improvement in performance or a reduction in training time as compared to prior art.
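For context, the thresholding approach that the abstract contrasts ASVM with can be sketched roughly as follows. This is not the ASVM objective, whose formulation is not given on this page; it is only a minimal illustration of shifting a classifier's decision threshold after training so that a user-specified false-positive tolerance is met on held-out negative examples. The function names, the scores, and the tolerance value are all hypothetical.

```python
def threshold_for_fpr(neg_scores, max_fpr):
    """Pick a decision threshold t so that the fraction of negative
    examples with score > t does not exceed max_fpr.

    neg_scores: decision values (e.g. SVM outputs) of known negatives.
    max_fpr:    tolerated false-positive rate, between 0 and 1.
    """
    if not neg_scores:
        raise ValueError("need at least one negative score")
    if not 0.0 <= max_fpr <= 1.0:
        raise ValueError("max_fpr must lie in [0, 1]")
    ranked = sorted(neg_scores, reverse=True)
    # Number of negatives we may allow above the threshold.
    allowed = int(max_fpr * len(ranked))
    if allowed >= len(ranked):
        return float("-inf")  # every negative may be misclassified
    # Scores strictly greater than ranked[allowed] number at most `allowed`.
    return ranked[allowed]


def false_positive_rate(neg_scores, threshold):
    """Fraction of negatives predicted positive (score > threshold)."""
    return sum(s > threshold for s in neg_scores) / len(neg_scores)


# Hypothetical decision values for ten held-out negative examples.
neg = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.2, 0.4, 0.8, 1.1, 1.5]
t = threshold_for_fpr(neg, max_fpr=0.2)
fpr = false_positive_rate(neg, t)
```

Post-hoc thresholding of this kind leaves the trained decision boundary unchanged and only moves the cut-off, which is the limitation the abstract points to when it motivates building the false-positive tolerance into the training objective instead.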

INDEX TERMS

Support vector machines, Accuracy, Training, Postal services, Tuning, Testing, Cancer, low false-positive learning, Support Vector Machine, classification

CITATION

Shan-Hung Wu, Keng-Pei Lin, Hao-Heng Chien, Chung-Min Chen, Ming-Syan Chen, "On Generalizable Low False-Positive Learning Using Asymmetric Support Vector Machines",

*IEEE Transactions on Knowledge & Data Engineering*, vol. 25, no. 5, pp. 1083-1096, May 2013, doi:10.1109/TKDE.2012.46