Issue No.11 - November (2010 vol.22)
pp: 1623-1636
David Rebollo-Monedero, Technical University of Catalonia, Barcelona
Jordi Forné, Technical University of Catalonia, Barcelona
Josep Domingo-Ferrer, Rovira i Virgili University, Tarragona
ABSTRACT
t-Closeness is a privacy model recently defined for data anonymization. A data set is said to satisfy t-closeness if, for each group of records sharing a combination of key attributes, the distance between the distribution of a confidential attribute in the group and the distribution of the attribute in the entire data set is no more than a threshold t. Here, we define a privacy measure in terms of information theory, similar to t-closeness. Then, we use the tools of that theory to show that our privacy measure can be achieved by the postrandomization method (PRAM) for masking in the discrete case, and by a form of noise addition in the general case.
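As a rough illustration of the two ideas named in the abstract, the following Python sketch checks a t-closeness-style condition on a toy data set and applies a PRAM-style randomization to a categorical attribute. It is only a sketch under stated assumptions: the total variation distance stands in for whichever distance a given definition prescribes, the record layout and all function names (`satisfies_t_closeness`, `pram`, and so on) are hypothetical, and none of it is the authors' information-theoretic measure or their construction.

```python
from collections import Counter, defaultdict
import random


def distribution(values):
    """Empirical distribution of a list of categorical values."""
    counts = Counter(values)
    total = len(values)
    return {v: c / total for v, c in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions.

    Assumption: used here only as a simple stand-in distance.
    """
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)


def satisfies_t_closeness(records, key_attrs, confidential, t):
    """True if every group of records sharing the same key-attribute values
    has a confidential-attribute distribution within distance t of the
    distribution over the whole data set."""
    overall = distribution([r[confidential] for r in records])
    groups = defaultdict(list)
    for r in records:
        groups[tuple(r[a] for a in key_attrs)].append(r[confidential])
    return all(
        total_variation(distribution(g), overall) <= t for g in groups.values()
    )


def pram(values, transition, rng):
    """PRAM-style masking: replace each value v by a random category drawn
    with the probabilities in transition[v] (one row of a transition matrix)."""
    masked = []
    for v in values:
        targets = list(transition[v].keys())
        weights = list(transition[v].values())
        masked.append(rng.choices(targets, weights=weights, k=1)[0])
    return masked


# Hypothetical toy data: "zip" and "age" play the role of key attributes,
# "disease" is the confidential attribute.
records = [
    {"zip": "A", "age": "30s", "disease": "flu"},
    {"zip": "A", "age": "30s", "disease": "cancer"},
    {"zip": "B", "age": "40s", "disease": "flu"},
    {"zip": "B", "age": "40s", "disease": "flu"},
]

loose = satisfies_t_closeness(records, ["zip", "age"], "disease", t=0.3)
strict = satisfies_t_closeness(records, ["zip", "age"], "disease", t=0.2)

# Hypothetical transition probabilities for the randomization step.
transition = {
    "flu": {"flu": 0.8, "cancer": 0.2},
    "cancer": {"flu": 0.2, "cancer": 0.8},
}
masked = pram([r["disease"] for r in records], transition, random.Random(0))
```

In this toy data set both groups lie at distance 0.25 from the overall distribution, so the check passes for t = 0.3 and fails for t = 0.2. The transition probabilities control the trade-off between how much the released values can differ from the originals and how much they reveal.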
INDEX TERMS
t-Closeness, microdata anonymization, information theory, rate-distortion theory, PRAM, noise addition.
CITATION
David Rebollo-Monedero, Jordi Forné, Josep Domingo-Ferrer, "From t-Closeness-Like Privacy to Postrandomization via Information Theory," IEEE Transactions on Knowledge & Data Engineering, vol. 22, no. 11, pp. 1623-1636, November 2010, doi:10.1109/TKDE.2009.190