Evaluating Tag-Based Preference Obfuscation Systems
Sept. 2012 (vol. 24 no. 9)
pp. 1613-1623
Andreas Pashalidis, K.U.Leuven, Leuven and IBBT, ESAT/SCD-COSIC
Bart Preneel, K.U.Leuven, Leuven and IBBT, ESAT/SCD-COSIC
While personalization is key to increasing the usability of online services, disclosing one's preferences is undesirable from a privacy perspective, because it enables profiling through the linkage of service invocations that may otherwise be unlinkable. This paper considers an easily implementable class of obfuscation strategies as a means to mitigate these risks, and examines its privacy/utility tradeoff. Our results are based on simulations that take place within a modular evaluation framework that can seamlessly accommodate real-world data. We conducted experiments with different simulated behaviors and two preference populations: a population of maximally diverse preferences, and one consisting of the movie preferences of some Netflix users. We measure utility in a way that is specific to the application of preference obfuscation. Privacy is measured in terms of unlinkability, with respect to two different adversaries. Our results show that reasonable privacy/utility tradeoffs require the disclosure of only small amounts of preference information.

Index Terms:
Privacy, Servers, Bayesian methods, Clustering algorithms, Simulated annealing, Partitioning algorithms, Probability distribution, Unlinkability, Preferences, Obfuscation
Andreas Pashalidis, Bart Preneel, "Evaluating Tag-Based Preference Obfuscation Systems," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 9, pp. 1613-1623, Sept. 2012, doi:10.1109/TKDE.2011.118