Issue No.02 - April-June (2008 vol.5)
pp: 87-98
The fundamental problem for inference control in data cubes is how to efficiently calculate the lower and upper bounds for each cell value given the aggregations of cell values over multiple dimensions. In this paper, we provide the first practical solution for estimating exact bounds in two-dimensional irregular data cubes (i.e., data cubes in which certain cell values are known to a snooper). Our results imply that the exact bounds cannot be obtained by a direct application of the Fréchet bounds in some cases. We then propose a new approach to improve the classic Fréchet bounds for any high-dimensional data cube in the most general case. The proposed approach improves upon the Fréchet bounds in the sense that it gives bounds that are at least as tight as those computed by Fréchet, yet is simpler in terms of time complexity. Based on our solutions to the fundamental problem, we discuss various security applications such as privacy protection of released data, fine-grained access control and auditing, and identify some future research directions.
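For orientation, the classic Fréchet bounds that the abstract takes as its baseline can be sketched for a two-dimensional table of nonnegative values in which only the row sums, column sums, and grand total are released. This is a minimal illustration of the textbook bounds, not the improved method proposed in the paper; the function name `frechet_bounds` and the sample marginals are assumptions made for this example.

```python
def frechet_bounds(row_sums, col_sums):
    """Classic Fréchet bounds for every cell of a 2-D table of nonnegative values.

    For a cell in row i and column j, with row sum r, column sum c, and
    grand total n:  max(0, r + c - n) <= x_ij <= min(r, c).
    """
    total = sum(row_sums)
    if total != sum(col_sums):
        raise ValueError("row sums and column sums must share the same grand total")
    return [
        [(max(0, r + c - total), min(r, c)) for c in col_sums]
        for r in row_sums
    ]


# Hypothetical released aggregations for a 2 x 2 table (grand total 10).
bounds = frechet_bounds([3, 7], [4, 6])
for row in bounds:
    print(row)
```

With these marginals, the cell in the second row and second column is bounded between 3 and 6, so a snooper learns that it is at least 3 even though the cell itself was never released. As the abstract notes, once some cell values are additionally known to the snooper, a direct application of these bounds does not always give the exact bounds.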
Inference engines, Data dependencies
Haibing Lu, Yingjiu Li, "Practical Inference Control for Data Cubes", IEEE Transactions on Dependable and Secure Computing, vol.5, no. 2, pp. 87-98, April-June 2008, doi:10.1109/TDSC.2007.70217