Issue No.07 - July (2009 vol.21)
pp: 1073-1087
Hekang Chen , Fudan University, Shanghai
Xiaokui Xiao , Chinese University of Hong Kong, Hong Kong
Yufei Tao , Chinese University of Hong Kong, Hong Kong
Donghui Zhang , Northeastern University, Boston
ABSTRACT
Generalization is a well-known method for privacy-preserving data publication. Despite its popularity, it has several drawbacks, including heavy information loss and difficulty in supporting marginal publication. To overcome these drawbacks, we develop ANGEL, a new anonymization technique that is as effective as generalization in privacy protection but retains significantly more information in the microdata. ANGEL is applicable to any monotonic principle (e.g., l-diversity, t-closeness), and its superiority in correlation preservation is especially evident when tight privacy control must be enforced. We show that ANGEL lends itself elegantly to the hard problem of marginal publication. In particular, unlike generalization, which can release only restricted marginals, our technique can easily be used to publish any marginals with strong privacy guarantees.
INDEX TERMS
Privacy, generalization, ANGEL.
CITATION
Hekang Chen, Xiaokui Xiao, Yufei Tao, Donghui Zhang, "ANGEL: Enhancing the Utility of Generalization for Privacy Preserving Publication", IEEE Transactions on Knowledge & Data Engineering, vol. 21, no. 7, pp. 1073-1087, July 2009, doi:10.1109/TKDE.2009.65