2008 Third International Conference on Availability, Reliability and Security (2008)
Mar. 4, 2008 to Mar. 7, 2008
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ARES.2008.136
Spam is considered an invasion of privacy. Its changing structure and variability create a need for new spam classification techniques. The present study proposes using Bayesian Additive Regression Trees (BART) for spam classification and evaluates its performance against other classification methods, including Logistic Regression, Support Vector Machines, Classification and Regression Trees, Neural Networks, Random Forests, and Naive Bayes. BART in its original form is not designed for such problems, so we modify it to make it applicable to classification. We evaluate the classifiers on three spam datasets (Ling-Spam, PU1, and Spambase) to determine their predictive accuracy and false positive rate.
BART, CART, classification, logistic regression, NNet, random forests, spam, SVM
Xinlei Wang, Suku Nair, Saeed Abu-Nimeh, Dario Nappa, "Bayesian Additive Regression Trees-Based Spam Detection for Enhanced Email Privacy", 2008 Third International Conference on Availability, Reliability and Security, pp. 1044-1051, 2008, doi:10.1109/ARES.2008.136