2010 IEEE 26th International Conference on Data Engineering (ICDE 2010)
Long Beach, CA, USA
Mar. 1, 2010 to Mar. 6, 2010
Yufei Tao, Chinese University of Hong Kong, Hong Kong
Jian Pei, Simon Fraser University, Canada
Jiexing Li, Chinese University of Hong Kong, Hong Kong
Xiaokui Xiao, Nanyang Technological University, Singapore
Ke Yi, Hong Kong University of Science and Technology, Hong Kong
Zhengzheng Xing, Simon Fraser University, Canada
Extracting useful correlations from a dataset has been extensively studied. In this paper, we deal with the opposite problem, which we call correlation hiding (CH). CH is fundamental in numerous applications that need to disseminate data containing sensitive information. We are given a relational table T whose attributes can be classified into three disjoint sets A, B, and C. The objective is to distort some values in T so that A becomes independent of B, while the correlation of A and of B with C is preserved as much as possible. CH differs from the problems studied previously in the area of data privacy in that it demands complete elimination of the correlation between two sets of attributes, whereas previous research focuses on partial elimination up to a certain level. A new operator called independence masking is proposed to solve the CH problem. Implementations of the operator with good worst-case guarantees are described in the full version of this short note.
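To make the independence requirement concrete, the following minimal sketch checks whether two attribute sets of a small table are independent in the empirical sense, that is, whether P(a, b) = P(a) P(b) holds for every pair of value combinations. It only illustrates the target condition of CH. It is not the independence masking operator, which the abstract does not specify. The function name `is_independent`, the parameters `a_cols` and `b_cols`, and the sample rows are all hypothetical.

```python
from collections import Counter


def is_independent(rows, a_cols, b_cols):
    """Return True if attribute sets A and B are empirically independent.

    rows   -- list of tuples, one per record of the table T
    a_cols -- column indices forming attribute set A (hypothetical name)
    b_cols -- column indices forming attribute set B (hypothetical name)
    """
    n = len(rows)
    if n == 0:
        return True  # an empty table is trivially independent

    def project(row, cols):
        # Restrict a record to the given columns.
        return tuple(row[i] for i in cols)

    a_counts = Counter(project(r, a_cols) for r in rows)
    b_counts = Counter(project(r, b_cols) for r in rows)
    joint = Counter((project(r, a_cols), project(r, b_cols)) for r in rows)

    # P(a, b) == P(a) * P(b)  is equivalent to
    # count(a, b) * n == count(a) * count(b), which avoids floating point.
    for a, count_a in a_counts.items():
        for b, count_b in b_counts.items():
            if joint[(a, b)] * n != count_a * count_b:
                return False
    return True


# Columns: A = column 0, B = column 1, C = column 2 (ignored by the check).
correlated = [("x", 0, "c1"), ("x", 0, "c2"), ("y", 1, "c1"), ("y", 1, "c2")]
masked = [("x", 0, "c1"), ("x", 1, "c2"), ("y", 0, "c1"), ("y", 1, "c2")]

correlated_result = is_independent(correlated, [0], [1])
masked_result = is_independent(masked, [0], [1])
```

In the first table the value of A determines the value of B, so the check fails. In the second table every combination of A and B values appears equally often, so the check succeeds. A solution to CH would additionally have to keep the distortion small and preserve the relationship of A and B to C, which this check does not measure.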
J. Li, X. Xiao, K. Yi, Z. Xing, Y. Tao and J. Pei, "Correlation hiding by independence masking," 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010), Long Beach, CA, USA, 2010, pp. 964-967.