Sixth International Conference on Intelligent Systems Design and Applications (ISDA'06), Volume 3
Bayesian Chinese Spam Filter Based on Crossed N-gram
Jinan, China, October 16-18, 2006
ISBN: 0-7695-2528-8
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ISDA.2006.17
Naive Bayesian spam filters are a well-known and powerful class of filters that can easily be induced from a dataset of sample cases. However, their performance on Chinese email is limited by the problem of word segmentation. In this paper, we present a Bayesian Chinese spam filter based on crossed N-grams. The method does not require word segmentation of Chinese emails, so it is not limited by inaccurate segmentation. It also needs no segmentation dictionary, and because its space and time efficiency are improved, it is easy to install on the user terminal to build an individualized spam filter. The independence assumption of the naive Bayes method is relaxed to some degree. Experimental results show that the proposed method achieves high accuracy at low cost.
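The abstract does not define the crossed N-gram features or the exact classifier, so the following is only a minimal sketch of the general idea, under stated assumptions: "crossed" is taken here to mean overlapping character N-grams produced by a sliding window, N is set to 2, and the classifier is a plain multinomial naive Bayes with add-one smoothing. The names `char_ngrams` and `NgramNaiveBayes` are hypothetical and do not come from the paper. The sketch shows why no word segmenter or dictionary is needed: features are read directly from the character sequence.

```python
import math
from collections import Counter


def char_ngrams(text, n=2):
    """Return overlapping character n-grams (assumed reading of 'crossed')."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


class NgramNaiveBayes:
    """Illustrative naive Bayes filter over character n-grams (not the paper's code)."""

    LABELS = ("spam", "ham")

    def __init__(self, n=2):
        self.n = n
        self.counts = {label: Counter() for label in self.LABELS}
        self.docs = {label: 0 for label in self.LABELS}

    def train(self, text, label):
        # No word segmentation: n-grams are taken straight from the characters.
        self.counts[label].update(char_ngrams(text, self.n))
        self.docs[label] += 1

    def log_score(self, text, label):
        # Vocabulary size, plus one slot reserved for unseen n-grams.
        vocab = len(set(self.counts["spam"]) | set(self.counts["ham"])) + 1
        total = sum(self.counts[label].values())
        # Add-one smoothed class prior.
        prior = (self.docs[label] + 1) / (sum(self.docs.values()) + len(self.LABELS))
        score = math.log(prior)
        for gram in char_ngrams(text, self.n):
            # Add-one smoothed likelihood of each n-gram given the class.
            score += math.log((self.counts[label][gram] + 1) / (total + vocab))
        return score

    def classify(self, text):
        return max(self.LABELS, key=lambda label: self.log_score(text, label))
```

A toy usage, with made-up training messages: after `train("免费领取大奖", "spam")` and `train("明天下午开会", "ham")`, calling `classify` on a new message compares the two smoothed log scores. Because adjacent N-grams share characters, each feature carries some local context, which is one plausible reading of how the independence assumption is relaxed; the paper itself should be consulted for the actual formulation.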
Citation:
Jianshe Dong, Haixia Cao, Peng Liu, Li Ren, "Bayesian Chinese Spam Filter Based on Crossed N-gram," isda, vol. 3, pp. 103-108, Sixth International Conference on Intelligent Systems Design and Applications (ISDA'06) Volume 3, 2006