Issue No.01 - January (2011 vol.23)
pp: 122-138
Nan Zhang, The George Washington University, Washington DC
ABSTRACT
We address issues related to the protection of private information in Online Analytical Processing (OLAP) systems, where a major privacy concern is the adversarial inference of private information from OLAP query answers. Most previous work on privacy-preserving OLAP focuses on a single aggregate function and/or addresses only exact disclosure, which eliminates from consideration an important class of privacy breaches where partial information, but not exact values, of private data is disclosed (i.e., partial disclosure). We address privacy protection against both exact and partial disclosure in OLAP systems with mixed aggregate functions. In particular, we propose an information-theoretic inference control approach that supports a combination of common aggregate functions (e.g., COUNT, SUM, MIN, MAX, and MEDIAN) and guarantees that the level of privacy disclosure does not exceed thresholds predetermined by the data owners. We demonstrate that our approach is efficient and can be implemented in existing OLAP systems with little modification. It also satisfies the simulatable auditing model and leaks no private information through query rejections. Through performance analysis, we show that, compared with previous approaches, our approach provides more effective privacy protection while maintaining a higher level of query-answer availability.
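The distinction between exact and partial disclosure can be made concrete with a small sketch. The code below is an illustration only and is not the inference control algorithm of the paper: it assumes private values drawn independently and uniformly from a small finite domain, and it uses a hypothetical measure, `disclosure_level`, defined here as the relative reduction in Shannon entropy of one private value after an adversary sees the answer to a single MAX query. The function names and the measure itself are assumptions made for this example.

```python
import itertools
import math
from collections import Counter


def entropy(dist):
    """Shannon entropy in bits of a distribution given as {value: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def posterior_after_max(domain, n, answer):
    """Posterior of the first of n private values, given MAX(all n values) == answer.

    Assumes i.i.d. uniform priors over `domain` (an assumption of this sketch)
    and enumerates every possible tuple by brute force.
    """
    counts = Counter(
        combo[0]
        for combo in itertools.product(domain, repeat=n)
        if max(combo) == answer
    )
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def disclosure_level(domain, n, answer):
    """Hypothetical measure: fraction of the prior entropy removed by the answer.

    1.0 means the value is fully determined (exact disclosure);
    anything strictly between 0 and 1 is partial disclosure.
    """
    domain = list(domain)
    prior = {value: 1 / len(domain) for value in domain}
    posterior = posterior_after_max(domain, n, answer)
    return 1 - entropy(posterior) / entropy(prior)


if __name__ == "__main__":
    # Two private values in {1, 2, 3, 4}; the adversary sees only the MAX answer.
    for answer in (1, 4):
        print(answer, disclosure_level(range(1, 5), 2, answer))
```

With two values in {1, 2, 3, 4}, a MAX answer of 1 pins both values to 1, which is exact disclosure. A MAX answer of 4 leaves the first value uncertain but makes 4 more likely than any other candidate, which is the kind of partial disclosure that a control addressing only exact disclosure would miss.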
INDEX TERMS
Online analytical processing (OLAP), privacy, information theory.
CITATION
Nan Zhang, "Privacy-Preserving OLAP: An Information-Theoretic Approach," IEEE Transactions on Knowledge & Data Engineering, vol. 23, no. 1, pp. 122-138, January 2011, doi:10.1109/TKDE.2010.25