2015 IEEE 31st International Conference on Data Engineering (ICDE) (2015)
Seoul, South Korea
April 13, 2015 to April 17, 2015
Erman Pattuk, University of Texas at Dallas, Richardson, USA
Murat Kantarcioglu, University of Texas at Dallas, Richardson, USA
Huseyin Ulusoy, University of Texas at Dallas, Richardson, USA
Bradley Malin, Vanderbilt University, Nashville, Tennessee, USA
Big data will enable the development of novel services that enhance a company's market advantage, competition, or productivity. At the same time, the utilization of such a service could disclose sensitive data in the process, which raises significant privacy concerns. To protect individuals, various policies, such as the Code of Fair Information Practices, as well as recent laws, require organizations to capture only the minimal amount of data necessary to support a service. While this is a notable goal, choosing the minimal data is a non-trivial process, especially when privacy and utility constraints must both be considered. In this paper, we introduce a technique to minimize sensitive data disclosure by focusing on privacy-aware feature selection. During model deployment, the service provider requests only a subset of the available features from the client, such that it can produce results with maximal confidence while minimizing its ability to violate the client's privacy. We propose an iterative approach, where the server requests information one feature at a time until the client-specified privacy budget is exhausted. The overall process is dynamic, such that the feature selected at each step depends on the previously selected features and their corresponding values. We demonstrate our technique with three popular classification algorithms and perform an empirical analysis over three real-world datasets to illustrate that, in almost all cases, classifiers that select features using our strategy have the same error rate as state-of-the-art static feature selection methods that fail to preserve privacy.
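The iterative request loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes a naive Bayes model over binary features, a fixed per-feature privacy cost, and a greedy criterion that picks the affordable feature with the lowest expected posterior entropy. The function names, feature names, costs, and probabilities are all hypothetical, and the likelihoods are assumed to lie strictly between 0 and 1.

```python
import math

def entropy(p):
    """Shannon entropy (bits) of a discrete distribution given as a list."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def normalize(w):
    s = sum(w)
    return [x / s for x in w]

def update(posterior, likelihood, value):
    """Bayes update of the class posterior after one binary feature is revealed.
    likelihood[c] = P(feature = 1 | class = c)."""
    return normalize([p * (l if value == 1 else 1 - l)
                      for p, l in zip(posterior, likelihood)])

def expected_entropy(posterior, likelihood):
    """Expected posterior entropy if this feature were requested next."""
    # P(feature = 1 | evidence so far), under the naive Bayes assumption.
    p1 = sum(p * l for p, l in zip(posterior, likelihood))
    total = 0.0
    for value, pv in ((1, p1), (0, 1 - p1)):
        if pv > 0:
            total += pv * entropy(update(posterior, likelihood, value))
    return total

def dynamic_select(prior, likelihoods, costs, budget, reveal):
    """Request one feature at a time until no remaining feature fits the budget.
    `reveal(name)` stands in for the client answering a request."""
    posterior, asked, remaining = list(prior), [], budget
    while True:
        candidates = [f for f in likelihoods
                      if f not in asked and costs[f] <= remaining]
        if not candidates:
            break
        # The choice depends on the current posterior, i.e. on earlier answers.
        best = min(candidates,
                   key=lambda f: expected_entropy(posterior, likelihoods[f]))
        posterior = update(posterior, likelihoods[best], reveal(best))
        asked.append(best)
        remaining -= costs[best]
    return asked, posterior

# Hypothetical two-class model and client record, for illustration only.
prior = [0.5, 0.5]
likelihoods = {
    "age_over_50": [0.8, 0.3],
    "smoker": [0.6, 0.5],
    "zip_code_urban": [0.5, 0.5],  # carries no information about the class
}
costs = {"age_over_50": 2, "smoker": 1, "zip_code_urban": 1}
client_record = {"age_over_50": 1, "smoker": 0, "zip_code_urban": 1}

asked, posterior = dynamic_select(prior, likelihoods, costs, budget=3,
                                  reveal=client_record.__getitem__)
print(asked, posterior)
```

With a budget of 3, this sketch requests the two informative features and never asks for `zip_code_urban`, so that value stays with the client. A static selector would instead fix the feature set before seeing any answers.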
Servers, Privacy, Data privacy, Decision trees, Measurement, Probability, Niobium
E. Pattuk, M. Kantarcioglu, H. Ulusoy and B. Malin, "Privacy-aware dynamic feature selection," 2015 IEEE 31st International Conference on Data Engineering (ICDE), Seoul, South Korea, 2015, pp. 78-88.