Third IEEE International Conference on Data Mining (2003)
Nov. 19, 2003 to Nov. 22, 2003
Hillol Kargupta, University of Maryland Baltimore County
Souptik Datta, University of Maryland Baltimore County
Qi Wang, Washington State University, Pullman
Krishnamoorthy Sivakumar, Washington State University, Pullman
Privacy is becoming an increasingly important issue in many data mining applications. This has triggered the development of many privacy-preserving data mining techniques. A large fraction of them use randomized data distortion to mask sensitive data. This methodology attempts to hide the sensitive data by randomly modifying the data values, often using additive noise. This paper questions the utility of the random value distortion technique in privacy preservation. The paper notes that random objects (particularly random matrices) have "predictable" structures in the spectral domain, and it develops a random matrix-based spectral filtering technique to retrieve the original data from a dataset distorted by adding random values. The paper presents the theoretical foundation of this filtering method and extensive experimental results to demonstrate that in many cases random data distortion preserves very little data privacy. The paper also points out possible avenues for the development of new privacy-preserving data mining techniques, such as exploiting multiplicative and colored noise.
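The spectral filtering idea summarized in the abstract can be illustrated with a minimal sketch. This is not the paper's exact algorithm. It assumes i.i.d. Gaussian additive noise with a known standard deviation, uses the Marchenko-Pastur upper edge as the bound on the noise eigenvalues of the sample covariance, and runs on synthetic low-rank data. The function name `spectral_filter` and all parameter values are hypothetical choices made for this illustration.

```python
import numpy as np


def spectral_filter(perturbed, noise_std):
    """Estimate the original data from additively perturbed data.

    Assumption (not taken from the source): noise entries are i.i.d. with
    known standard deviation `noise_std`, so the noise eigenvalues of the
    sample covariance stay below the Marchenko-Pastur upper edge.
    """
    m, n = perturbed.shape  # m records, n attributes
    mean = perturbed.mean(axis=0)
    centered = perturbed - mean

    # Sample covariance of the perturbed data and its eigendecomposition.
    cov = centered.T @ centered / m
    eigvals, eigvecs = np.linalg.eigh(cov)

    # Upper bound on eigenvalues that pure noise would produce.
    lambda_max = noise_std**2 * (1.0 + np.sqrt(n / m)) ** 2

    # Eigenvectors whose eigenvalues exceed the bound are treated as signal.
    signal_vecs = eigvecs[:, eigvals > lambda_max]

    # Project the perturbed data onto the estimated signal subspace.
    return centered @ signal_vecs @ signal_vecs.T + mean


# Synthetic example: correlated (low-rank) data hidden by additive noise.
rng = np.random.default_rng(0)
m, n, k = 2000, 30, 3
original = rng.normal(size=(m, k)) @ (5.0 * rng.normal(size=(k, n)))
noise_std = 1.0
perturbed = original + rng.normal(scale=noise_std, size=(m, n))

recovered = spectral_filter(perturbed, noise_std)

# Root-mean-square error before and after filtering.
err_perturbed = np.sqrt(np.mean((perturbed - original) ** 2))
err_recovered = np.sqrt(np.mean((recovered - original) ** 2))
print(f"RMSE of perturbed data: {err_perturbed:.3f}")
print(f"RMSE after filtering:   {err_recovered:.3f}")
```

In this sketch the filtering works because the synthetic attributes are strongly correlated, which concentrates the signal in a few large eigenvalues. Data with little correlation across attributes would leave less for the filter to separate from the noise.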
H. Kargupta, K. Sivakumar, S. Datta and Q. Wang, "On the Privacy Preserving Properties of Random Data Perturbation Techniques," Third IEEE International Conference on Data Mining (ICDM), Melbourne, Florida, 2003, pp. 99.