Issue No. 8 - August 2004 (vol. 16)
pp. 939-948
ABSTRACT
The problem of disseminating a data set for machine learning while controlling the disclosure of data source identity is described using a commuting diagram of functions. This formalization is used to present and analyze an optimization problem balancing privacy and data utility requirements. The analysis points to the application of a generalization mechanism for maintaining privacy in view of machine learning needs. We present new proofs of NP-hardness of the problem of minimizing information loss while satisfying a set of privacy requirements, both with and without the addition of a particular uniform coding requirement. As an initial analysis of the approximation properties of the problem, we show that the cell suppression problem with a constant number of attributes can be approximated within a constant. As a side effect, proofs of NP-hardness of the minimum k-union, maximum k-intersection, and parallel versions of these problems are presented. Bounded versions of these problems are also shown to be approximable within a constant.
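For readers unfamiliar with the combinatorial problems named above, the following is a minimal sketch of their standard formulations; the notation ($U$, $S_i$, $\mathcal{T}$, $k$) is ours for illustration and is not taken from the paper itself.

\textbf{Minimum $k$-union.} Given a family $\mathcal{S} = \{S_1, \ldots, S_n\}$ of subsets of a
finite universe $U$ and an integer $k \le n$, find a subfamily
$\mathcal{T} \subseteq \mathcal{S}$ with $|\mathcal{T}| = k$ minimizing
$\bigl|\bigcup_{S \in \mathcal{T}} S\bigr|$.

\textbf{Maximum $k$-intersection.} Given the same input, find
$\mathcal{T} \subseteq \mathcal{S}$ with $|\mathcal{T}| = k$ maximizing
$\bigl|\bigcap_{S \in \mathcal{T}} S\bigr|$.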
INDEX TERMS
Privacy, disclosure control, combinatorial optimization, complexity, approximation properties, machine learning.
CITATION
Staal A. Vinterbo, "Privacy: A Machine Learning View," IEEE Transactions on Knowledge & Data Engineering, vol. 16, no. 8, pp. 939-948, August 2004, doi:10.1109/TKDE.2004.31