Issue No. 08 - August (2011 vol. 23)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2010.247
Xiaokui Xiao, Nanyang Technological University, Singapore
Guozhang Wang, Cornell University, Ithaca
Johannes Gehrke, Cornell University, Ithaca
Privacy-preserving data publishing has attracted considerable research interest in recent years. Among the existing solutions, ε-differential privacy provides the strongest privacy guarantee. Existing data publishing methods that achieve ε-differential privacy, however, offer little data utility. In particular, if the output data set is used to answer count queries, the noise in the query answers can be proportional to the number of tuples in the data, which renders the results useless. In this paper, we develop a data publishing technique that ensures ε-differential privacy while providing accurate answers for range-count queries, i.e., count queries where the predicate on each attribute is a range. The core of our solution is a framework that applies wavelet transforms to the data before adding noise to it. We present instantiations of the proposed framework for both ordinal and nominal data, and we provide a theoretical analysis of their privacy and utility guarantees. In an extensive experimental study on both real and synthetic data, we show the effectiveness and efficiency of our solution.
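As a rough illustration of the idea in the abstract (transform first, then add noise), the following Python sketch handles a one-dimensional ordinal histogram with a Haar wavelet. It is not the authors' implementation: the function names, the per-coefficient weights, the noise scale `lam`, and the assumption that neighboring data sets change the histogram by at most `l1_change` in L1 norm are all choices made for this sketch, and the paper's own instantiations for ordinal and nominal data may differ.

```python
import numpy as np


def haar_forward(hist):
    """Haar transform: overall mean plus half-differences of subtree averages."""
    avg = np.asarray(hist, dtype=float)
    details = []  # finest level first, coarsest (root) last
    while avg.size > 1:
        left, right = avg[0::2], avg[1::2]
        details.append((left - right) / 2.0)
        avg = (left + right) / 2.0
    return avg[0], details


def haar_inverse(base, details):
    """Rebuild the histogram from the mean and the detail coefficients."""
    avg = np.array([base], dtype=float)
    for d in reversed(details):  # coarsest level first
        out = np.empty(avg.size * 2)
        out[0::2] = avg + d
        out[1::2] = avg - d
        avg = out
    return avg


def privatize_histogram(hist, epsilon, rng, l1_change=2.0):
    """Add Laplace noise to Haar coefficients, then invert the transform.

    Assumption of this sketch: neighboring data sets differ in one tuple,
    so the histogram changes by at most l1_change = 2 in L1 norm.
    """
    m = len(hist)
    if m == 0 or m & (m - 1):
        raise ValueError("sketch assumes a power-of-two domain size")
    levels = m.bit_length() - 1  # log2(m)
    # A unit change in one entry moves the weighted coefficients by
    # exactly 1 + log2(m) in total (weight m for the mean, m / k for a
    # level holding k coefficients), so the scale is calibrated to that.
    lam = l1_change * (1 + levels) / epsilon
    base, details = haar_forward(hist)
    noisy_base = base + rng.laplace(scale=lam / m)
    noisy_details = [
        d + rng.laplace(scale=lam / (m / d.size), size=d.size) for d in details
    ]
    return haar_inverse(noisy_base, noisy_details)


def range_count(hist, lo, hi):
    """Answer a range-count query over bins lo..hi (inclusive)."""
    return float(np.sum(hist[lo:hi + 1]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    counts = np.array([3, 1, 4, 1, 5, 9, 2, 6], dtype=float)
    published = privatize_histogram(counts, epsilon=1.0, rng=rng)
    print(range_count(counts, 2, 5), range_count(published, 2, 5))
```

In this sketch the coarse coefficients, which influence many bins, receive proportionally smaller noise, while each bin touches only one coefficient per level. Multi-dimensional and nominal attributes are not covered here.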
Privacy-preserving data publishing, differential privacy, wavelets.
X. Xiao, G. Wang and J. Gehrke, "Differential Privacy via Wavelet Transforms," in IEEE Transactions on Knowledge & Data Engineering, vol. 23, no. 8, pp. 1200-1214, 2011.