2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI'06)
Personalized Spam Filtering with Semi-supervised Classifier Ensemble
Hong Kong, China
December 18-December 22
ISBN: 0-7695-2747-7
BibTeX:
@article{10.1109/WI.2006.132,
  author = {Victor Cheng and C.H. Li},
  title = {Personalized Spam Filtering with Semi-supervised Classifier Ensemble},
  journal = {Web Intelligence, IEEE / WIC / ACM International Conference on},
  volume = {0},
  year = {2006},
  isbn = {0-7695-2747-7},
  pages = {195-201},
  doi = {http://doi.ieeecomputersociety.org/10.1109/WI.2006.132},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/WI.2006.132
The proliferation of unsolicited emails, also known as spam, poses a significant burden to email users worldwide. Recent research on spam filtering has shown that high accuracy can be obtained if labeled email examples are available from the particular user of the spam filter. However, the time-consuming process of providing personalized labeled training examples is often inconvenient or impossible due to privacy issues. In this paper, a semi-supervised personalized spam filter based on a classifier ensemble is proposed that classifies a user's emails accurately by learning from both generic labeled emails and personalized unlabeled emails. The proposed multi-stage classification process begins by learning an SVM model from labeled generic data. The user's unlabeled emails are then fed to this SVM to generate personalized labeled data for constructing personalized naive Bayes classifiers. Furthermore, some personalized labeled examples are generated by exploiting rare word distributions and then fed into a semi-supervised classifier. The multi-stage results are integrated with SVMs learned from generic labeled emails to produce the final classification results. Experimental results show that the proposed approaches can significantly increase the classification accuracy in spam filtering.
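The first two stages described above (a generic classifier that pseudo-labels the user's unlabeled mail, followed by a personalized naive Bayes model trained on those pseudo-labels) can be illustrated with a minimal sketch. This is not the authors' implementation. To stay dependency-free it substitutes a perceptron for the SVM, it omits the rare-word stage entirely, and it combines the two models by simply adding their scores, which is an assumption because the abstract does not specify the integration rule. All function names and the toy emails (including the made-up word "zorblex") are hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def train_perceptron(docs, labels, epochs=10):
    """Linear classifier; a simple stand-in for the SVM stage (assumption)."""
    w, bias = Counter(), 0.0
    for _ in range(epochs):
        for doc, y in zip(docs, labels):  # y is +1 (spam) or -1 (ham)
            score = bias + sum(w[t] for t in tokenize(doc))
            if y * score <= 0:  # misclassified or on the boundary: update
                for t in tokenize(doc):
                    w[t] += y
                bias += y
    return w, bias

def linear_score(model, doc):
    w, bias = model
    return bias + sum(w[t] for t in tokenize(doc))

def train_naive_bayes(docs, labels):
    """Multinomial naive Bayes word counts per class."""
    counts = {1: Counter(), -1: Counter()}
    n_docs = Counter(labels)
    for doc, y in zip(docs, labels):
        counts[y].update(tokenize(doc))
    vocab = set(counts[1]) | set(counts[-1])
    return counts, n_docs, vocab

def nb_log_odds(model, doc):
    """log P(spam | doc) - log P(ham | doc), with Laplace smoothing."""
    counts, n_docs, vocab = model
    total = {y: sum(counts[y].values()) for y in (1, -1)}
    log_odds = math.log((n_docs[1] + 1) / (n_docs[-1] + 1))
    for t in tokenize(doc):
        if t in vocab:  # words unseen in the user's mail are ignored
            p_spam = (counts[1][t] + 1) / (total[1] + len(vocab))
            p_ham = (counts[-1][t] + 1) / (total[-1] + len(vocab))
            log_odds += math.log(p_spam / p_ham)
    return log_odds

def personalized_filter(generic_docs, generic_labels, user_docs):
    # Stage 1: learn a generic model from labeled generic data.
    base = train_perceptron(generic_docs, generic_labels)
    # Stage 2: pseudo-label the user's unlabeled emails with that model.
    pseudo = [1 if linear_score(base, d) > 0 else -1 for d in user_docs]
    # Stage 3: train a personalized naive Bayes model on the pseudo-labels.
    nb = train_naive_bayes(user_docs, pseudo)

    def classify(doc):
        # Integration by summing scores is an illustrative assumption.
        combined = linear_score(base, doc) + nb_log_odds(nb, doc)
        return "spam" if combined > 0 else "ham"
    return classify

# Toy data: generic labeled mail, then one user's unlabeled mail.
generic_docs = ["cheap pills buy now", "win money now",
                "meeting agenda attached", "lunch tomorrow at noon"]
generic_labels = [1, 1, -1, -1]
user_docs = ["cheap pills zorblex offer", "win zorblex money",
             "project meeting agenda", "lunch with project team"]

classify = personalized_filter(generic_docs, generic_labels, user_docs)
print(classify("zorblex offer"))  # spam
print(classify("project team"))   # ham
```

In this toy run the generic model has never seen "zorblex", "offer", "project", or "team", so on its own it scores both test messages at exactly zero. The personalized naive Bayes model picks up those user-specific words from the pseudo-labeled mail, which is the effect the multi-stage design is meant to capture.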
Citation:
Victor Cheng, C.H. Li, "Personalized Spam Filtering with Semi-supervised Classifier Ensemble," 2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI'06), pp. 195-201, 2006.
