An Adaptive Fusion Algorithm for Spam Detection
PrePrint
ISSN: 1541-1672
Congfu Xu, Institute of Artificial Intelligence College of Computer Science, Hangzhou
Baojun Su, Institute of Artificial Intelligence College of Computer Science, Hangzhou
Yunbiao Cheng, Institute of Artificial Intelligence College of Computer Science, Zhejiang University, Hangzhou
Weike Pan, College of Computer Science and Software Engineering, Shenzhen
Spam detection has become a critical component of many online systems, such as email services, advertising engines, and social media sites. Spam is both diverse and dynamic, so a single online learner, as deployed by many commercial systems, is usually not sufficient to capture its different aspects and may fail to learn the model parameters accurately. In this paper, we take email services as an example and present an adaptive fusion algorithm for spam detection (AFSD), a general content-based approach that can be applied to non-email spam detection tasks with little additional effort. In our proposed algorithm, we (1) use n-grams of non-tokenized text strings to represent an email, (2) introduce a link function to convert the prediction scores of the online learners into more comparable ones, (3) train the online learners in a mistake-driven manner via “thick thresholding” to obtain highly competitive online learners, and (4) design update rules that adaptively integrate the online learners to capture different aspects of spam. We study the prediction performance of AFSD on five public competition datasets and one industry dataset, and observe that our algorithm achieves significantly better results than several state-of-the-art approaches, including the champion solutions of the corresponding competitions.
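The four steps listed in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the link function, the learners, or the fusion update rules. The sketch assumes a logistic link, perceptron-style linear learners over character n-grams, and a multiplicative-weights rule as a stand-in for the adaptive integration. All names and parameters (`OnlineLearner`, `Fusion`, `margin`, `beta`, the toy messages) are hypothetical.

```python
import math

def char_ngrams(text, n):
    """Step 1: overlapping character n-grams of the raw, non-tokenized string."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def link(score):
    """Step 2 (assumed logistic link): map a raw score into (0, 1)."""
    score = max(-30.0, min(30.0, score))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-score))

class OnlineLearner:
    """Hypothetical linear learner over binary character n-gram features."""
    def __init__(self, n, lr=0.5, margin=0.25):
        self.n, self.lr, self.margin = n, lr, margin
        self.w = {}

    def predict(self, text):
        return link(sum(self.w.get(f, 0.0) for f in char_ngrams(text, self.n)))

    def update(self, text, label):
        """Step 3: mistake-driven training with thick thresholding.

        Update on a wrong prediction, and also when the prediction is
        correct but falls within `margin` of the 0.5 decision boundary.
        """
        p = self.predict(text)
        mistake = (p > 0.5) != (label == 1)
        if mistake or abs(p - 0.5) < self.margin:
            step = self.lr * (label - p)
            for f in char_ngrams(text, self.n):
                self.w[f] = self.w.get(f, 0.0) + step

class Fusion:
    """Step 4 (assumed rule): weighted average of the learners, where a
    learner's weight is multiplied by `beta` each time it errs."""
    def __init__(self, learners, beta=0.5):
        self.learners = learners
        self.beta = beta
        self.weights = [1.0 / len(learners)] * len(learners)

    def predict(self, text):
        return sum(w * l.predict(text)
                   for w, l in zip(self.weights, self.learners))

    def update(self, text, label):
        for i, l in enumerate(self.learners):
            if (l.predict(text) > 0.5) != (label == 1):
                self.weights[i] *= self.beta  # penalize the erring learner
            l.update(text, label)
        total = sum(self.weights)
        self.weights = [w / total for w in self.weights]  # renormalize

# Toy stream of (message, label) pairs, with 1 = spam and 0 = ham.
stream = [
    ("win free money fast", 1),
    ("meeting at noon tomorrow", 0),
    ("free money click here", 1),
    ("lunch meeting notes attached", 0),
]
fusion = Fusion([OnlineLearner(n=3), OnlineLearner(n=5)])
for _ in range(3):  # a few passes over the toy stream
    for text, label in stream:
        fusion.update(text, label)
```

After training, `fusion.predict(text)` returns a fused score in (0, 1), with values above 0.5 read as spam. Using learners with different n-gram lengths is only one way to make the learners capture different aspects of the input.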
Citation:
Congfu Xu, Baojun Su, Yunbiao Cheng, Weike Pan, "An Adaptive Fusion Algorithm for Spam Detection," IEEE Intelligent Systems, 18 July 2013. IEEE Computer Society Digital Library. IEEE Computer Society, <http://doi.ieeecomputersociety.org/10.1109/MIS.2013.54>