Issue No. 06 - June (2018 vol. 30)
ISSN: 1041-4347
pp: 1164-1177
Zhizhou Yin , School of Information Technologies, University of Sydney, Camperdown, NSW, Australia
Fei Wang , School of Information Technologies, University of Sydney, Camperdown, NSW, Australia
Wei Liu , Advanced Analytics Institute, University of Technology, Sydney, Ultimo, NSW, Australia
Sanjay Chawla , Qatar Computing Research Institute (QCRI), HBKU, Doha, Qatar
ABSTRACT
Adversarial learning is the study of machine learning techniques deployed in non-benign environments. Example applications include spam detection, network intrusion detection, and credit card scoring. As the use of machine learning grows in diverse application domains, the possibility of adversarial behavior is likely to increase. When adversarial learning is modelled in a game-theoretic setup, the standard assumption about the adversary (one player) is that it can change all of the features used by the classifier (the opposing player) at will. The adversary pays a cost proportional to the size of the “attack”. We refer to this form of adversarial behavior as a dense feature attack. However, the aim of an adversary is not just to subvert a classifier but also to transform the data in such a way that spam continues to remain effective. We demonstrate that an adversary could potentially achieve this objective by carrying out a sparse feature attack. We design an algorithm to show how a classifier should be designed to be robust against sparse adversarial attacks. Our main insight is that sparse feature attacks are best defended by designing classifiers which use $\ell_1$ regularizers.
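The distinction between a dense and a sparse feature attack can be illustrated with a toy linear classifier. The sketch below is not the algorithm from the paper: the helper names `dense_attack` and `sparse_attack`, the weights, the per-feature bound `max_change`, and the greedy strategy are all assumptions chosen only to show that an evasion can touch every feature or just a few.

```python
import numpy as np

def dense_attack(x, w, b, margin=1e-3):
    """Hypothetical helper: smallest l2 shift that moves x across w.x + b = 0.

    The shift is along -w, so every feature with a nonzero weight changes.
    """
    score = w @ x + b
    if score < 0:
        return x.astype(float).copy()
    return x - (score + margin) / (w @ w) * w

def sparse_attack(x, w, b, max_change=1.0, margin=1e-3):
    """Hypothetical helper: greedy evasion that edits as few features as it can.

    Features are visited in order of decreasing |w_i|; each is moved by at
    most max_change (an assumed per-feature bound) until the score is negative.
    """
    x_adv = x.astype(float).copy()
    score = w @ x_adv + b
    for i in np.argsort(-np.abs(w)):
        if score < 0 or w[i] == 0:
            break
        step = min(max_change, (score + margin) / abs(w[i]))
        x_adv[i] -= np.sign(w[i]) * step   # lowers the score by |w_i| * step
        score -= abs(w[i]) * step
    return x_adv

# Toy example (assumed values): a positive score means "spam".
w = np.array([2.0, -1.0, 0.5, 0.1])
b = -0.5
x = np.array([1.0, 0.0, 1.0, 1.0])

x_dense = dense_attack(x, w, b)
x_sparse = sparse_attack(x, w, b)

print("features changed (dense): ", np.count_nonzero(x_dense - x))
print("features changed (sparse):", np.count_nonzero(x_sparse - x))
```

Both perturbed points fall on the non-spam side of the decision boundary, but the sparse variant leaves most of the original features, and therefore most of the message, unchanged.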
INDEX TERMS
Games, Robustness, Data models, Electronic mail, Game theory, Gallium nitride, Transforms
CITATION

Z. Yin, F. Wang, W. Liu and S. Chawla, "Sparse Feature Attacks in Adversarial Learning," in IEEE Transactions on Knowledge & Data Engineering, vol. 30, no. 6, pp. 1164-1177, 2018.
doi:10.1109/TKDE.2018.2790928