Association Rule Hiding
April 2004 (vol. 16 no. 4)
pp. 434-447

Abstract: Large repositories of data contain sensitive information that must be protected against unauthorized access. Protecting the confidentiality of this information has been a long-term goal of the database security research community and of government statistical agencies. Recent advances in data mining and machine learning algorithms have increased the disclosure risks that one may encounter when releasing data to outside parties. A key problem, still not sufficiently investigated, is the need to balance the confidentiality of the disclosed data with the legitimate needs of the data users. Every disclosure limitation method modifies, in some way, true data values and relationships. In this paper, we investigate confidentiality issues of a broad category of rules, the association rules. In particular, we present three strategies and five algorithms for hiding a group of association rules that is characterized as sensitive. A rule is characterized as sensitive if its disclosure risk is above a certain privacy threshold. Sensitive rules sometimes should not be disclosed to the public since, among other things, they may be used for inferring sensitive data, or they may provide business competitors with an advantage. We also perform an evaluation study of the hiding algorithms in order to analyze their time complexity and their impact on the original database.
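
To make the idea of hiding a rule concrete, the following is a minimal sketch and not one of the five algorithms from the paper. It assumes that a rule's exposure is judged by the usual support and confidence measures of association rule mining, and that a rule counts as hidden once either measure falls below a chosen minimum. The function names (`support`, `confidence`, `is_exposed`, `hide_rule`) and the strategy of deleting one right-hand-side item from supporting transactions are illustrative assumptions.

```python
from typing import List, Set

Transaction = Set[str]


def support(db: List[Transaction], itemset: Set[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not db:
        return 0.0
    return sum(1 for t in db if itemset <= t) / len(db)


def confidence(db: List[Transaction], lhs: Set[str], rhs: Set[str]) -> float:
    """support(lhs ∪ rhs) / support(lhs), or 0.0 if lhs never occurs."""
    lhs_support = support(db, lhs)
    if lhs_support == 0:
        return 0.0
    return support(db, lhs | rhs) / lhs_support


def is_exposed(db: List[Transaction], lhs: Set[str], rhs: Set[str],
               min_support: float, min_confidence: float) -> bool:
    """Assumed criterion: a miner would report the rule lhs -> rhs."""
    return (support(db, lhs | rhs) >= min_support
            and confidence(db, lhs, rhs) >= min_confidence)


def hide_rule(db: List[Transaction], lhs: Set[str], rhs: Set[str],
              min_support: float, min_confidence: float) -> List[Transaction]:
    """Naive illustration: delete one rhs item from supporting transactions,
    one transaction at a time, until the rule is no longer exposed.
    Works on a copy, so the original database is left untouched."""
    sanitized = [set(t) for t in db]
    victim = sorted(rhs)[0]  # hypothetical choice of which item to remove
    for t in sanitized:
        if not is_exposed(sanitized, lhs, rhs, min_support, min_confidence):
            break
        if (lhs | rhs) <= t:
            t.discard(victim)
    return sanitized
```

Each deletion changes true data values, which is the trade-off the abstract points to: a hiding procedure has to be judged by how many transactions it alters and by its side effects on non-sensitive rules, not only by whether the sensitive rule disappears.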

Index Terms:
Privacy preserving data mining, association rule mining, sensitive rule hiding.
Vassilios S. Verykios, Ahmed K. Elmagarmid, Elisa Bertino, Yucel Saygin, Elena Dasseni, "Association Rule Hiding," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 4, pp. 434-447, April 2004, doi:10.1109/TKDE.2004.1269668