Issue No.11 - November (2011 vol.23)
Keng-Pei Lin, National Taiwan University, Taipei
Ming-Syan Chen, National Taiwan University, and Academia Sinica, Taipei
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2010.193
The support vector machine (SVM) is a widely used tool for classification problems. The SVM trains a classifier by solving an optimization problem that determines which instances of the training data set are support vectors, the informative instances needed to form the SVM classifier. Because support vectors are intact tuples taken from the training data set, releasing the SVM classifier for public use or shipping it to clients discloses the private content of the support vectors. Such disclosure violates privacy-preserving requirements imposed for legal or commercial reasons. The classifier learned by the SVM thus inherently violates privacy, which restricts the applicability of the SVM. To the best of our knowledge, no prior work has extended the notion of privacy preservation to tackle this inherent privacy violation problem of the SVM classifier. In this paper, we examine this privacy violation problem and propose an approach that postprocesses the SVM classifier to transform it into a privacy-preserving classifier that does not disclose the private content of support vectors. The postprocessed SVM classifier, which does not expose the private content of the training data, is called the Privacy-Preserving SVM Classifier (abbreviated as PPSVC). The PPSVC is designed for the commonly used Gaussian kernel function. It precisely approximates the decision function of the Gaussian kernel SVM classifier without exposing the sensitive attribute values possessed by support vectors. By applying the PPSVC, the SVM classifier can be publicly released while preserving privacy. We prove that the PPSVC is robust against adversarial attacks. Experiments on real data sets show that the classification accuracy of the PPSVC is comparable to that of the original SVM classifier.
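The privacy problem described in the abstract can be illustrated with a minimal sketch of a Gaussian kernel SVM in its standard released form, where the decision function is a weighted sum of kernel evaluations against the stored support vectors plus a bias. The sketch does not implement the PPSVC. The class name `GaussianSVMClassifier`, the example records, and the coefficient, bias, and `gamma` values are hypothetical and chosen only for illustration; no real training is performed.

```python
import math


def gaussian_kernel(x, z, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


class GaussianSVMClassifier:
    """Hypothetical container for an already-trained Gaussian kernel SVM."""

    def __init__(self, support_vectors, coefs, bias, gamma):
        # The support vectors are raw training tuples stored verbatim.
        self.support_vectors = support_vectors
        self.coefs = coefs  # alpha_i * y_i for each support vector
        self.bias = bias
        self.gamma = gamma

    def decision_function(self, x):
        # f(x) = sum_i coef_i * K(sv_i, x) + bias
        total = sum(
            c * gaussian_kernel(x, sv, self.gamma)
            for c, sv in zip(self.coefs, self.support_vectors)
        )
        return total + self.bias

    def predict(self, x):
        return 1 if self.decision_function(x) >= 0 else -1


# Made-up private records, e.g. (age, income in thousands).
training_data = [(34.0, 52.0), (61.0, 18.0), (45.0, 75.0)]

# Assumption: training selected the first two records as support vectors.
model = GaussianSVMClassifier(
    support_vectors=[training_data[0], training_data[1]],
    coefs=[0.8, -0.8],
    bias=0.0,
    gamma=0.01,
)

# Anyone holding the model can read training records straight out of it.
leaked = [sv for sv in model.support_vectors if sv in training_data]
```

Evaluating `decision_function` requires the support vectors themselves, so whoever receives this form of the model also receives those training tuples unchanged.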
Privacy-preserving data mining, classification, support vector machines.
Keng-Pei Lin, Ming-Syan Chen, "On the Design and Analysis of the Privacy-Preserving SVM Classifier", IEEE Transactions on Knowledge & Data Engineering, vol. 23, no. 11, pp. 1704-1717, November 2011, doi:10.1109/TKDE.2010.193