Issue No.07 - July (2010 vol.22)
Ninghui Li, Purdue University, West Lafayette
Tiancheng Li, Purdue University, West Lafayette
Suresh Venkatasubramanian, University of Utah, Salt Lake City
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2009.139
The k-anonymity privacy requirement for publishing microdata requires that each equivalence class (i.e., a set of records that are indistinguishable from each other with respect to certain “identifying” attributes) contains at least k records. Recently, several authors have recognized that k-anonymity cannot prevent attribute disclosure. The notion of ℓ-diversity has been proposed to address this; ℓ-diversity requires that each equivalence class has at least ℓ well-represented (in Section 2) values for each sensitive attribute. In this paper, we show that ℓ-diversity has a number of limitations. In particular, it is neither necessary nor sufficient to prevent attribute disclosure. Motivated by these limitations, we propose a new notion of privacy called “closeness.” We first present the base model t-closeness, which requires that the distribution of a sensitive attribute in any equivalence class is close to the distribution of the attribute in the overall table (i.e., the distance between the two distributions should be no more than a threshold t). We then propose a more flexible privacy model called (n,t)-closeness that offers higher utility. We describe our desiderata for designing a distance measure between two probability distributions and present two distance measures. We discuss the rationale for using closeness as a privacy measure and illustrate its advantages through examples and experiments.
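As an illustration of the k-anonymity and t-closeness conditions described in the abstract, the following is a minimal Python sketch and not the authors' implementation. The abstract does not name its two distance measures, so the sketch assumes total variation distance as a stand-in. The function names, the record layout, and the toy table are all hypothetical.

```python
from collections import Counter, defaultdict


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same values on the identifying attributes."""
    classes = defaultdict(list)
    for rec in records:
        key = tuple(rec[attr] for attr in quasi_identifiers)
        classes[key].append(rec)
    return classes


def distribution(records, sensitive):
    """Empirical distribution of the sensitive attribute over a set of records."""
    counts = Counter(rec[sensitive] for rec in records)
    total = len(records)
    return {value: count / total for value, count in counts.items()}


def total_variation(p, q):
    """Assumed distance measure: half the L1 distance between two distributions."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)


def is_k_anonymous(records, quasi_identifiers, k):
    """Every equivalence class must contain at least k records."""
    classes = equivalence_classes(records, quasi_identifiers)
    return all(len(group) >= k for group in classes.values())


def max_class_distance(records, quasi_identifiers, sensitive):
    """Largest distance between any class distribution and the overall one."""
    overall = distribution(records, sensitive)
    classes = equivalence_classes(records, quasi_identifiers)
    return max(
        total_variation(distribution(group, sensitive), overall)
        for group in classes.values()
    )


def is_t_close(records, quasi_identifiers, sensitive, t):
    """Every class distribution must be within distance t of the overall one."""
    return max_class_distance(records, quasi_identifiers, sensitive) <= t


# Hypothetical, already-generalized toy table (not data from the paper).
table = [
    {"zip": "476**", "age": "2*", "disease": "flu"},
    {"zip": "476**", "age": "2*", "disease": "flu"},
    {"zip": "479**", "age": "3*", "disease": "flu"},
    {"zip": "479**", "age": "3*", "disease": "cancer"},
]
qi = ["zip", "age"]

print(is_k_anonymous(table, qi, k=2))
print(max_class_distance(table, qi, "disease"))
print(is_t_close(table, qi, "disease", t=0.3))
```

In this toy table the first class contains only one disease value, so it satisfies 2-anonymity while still revealing the sensitive value of everyone in that class; the closeness check quantifies how far each class departs from the overall distribution. A distance that accounts for semantic similarity between attribute values would need a different `total_variation` replacement.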
Privacy preservation, data anonymization, data publishing, data security.
Ninghui Li, Tiancheng Li, Suresh Venkatasubramanian, "Closeness: A New Privacy Measure for Data Publishing", IEEE Transactions on Knowledge & Data Engineering, vol.22, no. 7, pp. 943-956, July 2010, doi:10.1109/TKDE.2009.139