Issue No.09 - Sept. (2012 vol.24)

pp: 1613-1623

Andreas Pashalidis, K.U.Leuven, Leuven and IBBT, ESAT/SCD-COSIC

Bart Preneel, K.U.Leuven, Leuven and IBBT, ESAT/SCD-COSIC

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2011.118

ABSTRACT

While personalization is key to increasing the usability of online services, disclosing one's preferences is undesirable from a privacy perspective, because it enables profiling through the linkage of what may otherwise be unlinkable service invocations. This paper considers an easily implementable class of obfuscation strategies as a means to mitigate these risks, and examines its privacy/utility tradeoff. Our results are based on simulations that take place within a modular evaluation framework that can seamlessly accommodate real-world data. We conducted experiments with different simulated behaviors and two preference populations: one of maximally diverse preferences and one consisting of the movie preferences of some Netflix users. We measure utility in a way that is specific to the application of preference obfuscation. Privacy is measured in terms of unlinkability with respect to two different adversaries. Our results show that reasonable privacy/utility tradeoffs require the disclosure of only small amounts of preference information.

INDEX TERMS

Privacy, Servers, Bayesian methods, Clustering algorithms, Simulated annealing, Partitioning algorithms, Probability distribution, Unlinkability, Preferences, Obfuscation

CITATION

Andreas Pashalidis, Bart Preneel, "Evaluating Tag-Based Preference Obfuscation Systems",

*IEEE Transactions on Knowledge & Data Engineering*, vol. 24, no. 9, pp. 1613-1623, Sept. 2012, doi:10.1109/TKDE.2011.118