Issue No.08 - August (2009 vol.21)
pp: 1178-1190
Zhicheng Dou, Microsoft Research Asia, Beijing
Ruihua Song, Microsoft Research Asia, Beijing
Ji-Rong Wen, Microsoft Research Asia, Beijing
Xiaojie Yuan, Nankai University, Tianjin
ABSTRACT
Although personalized search has been studied for many years and many personalization algorithms have been investigated, it is still unclear whether personalization is consistently effective across different queries, different users, and different search contexts. In this paper, we study this problem and report our findings. We present a large-scale evaluation framework for personalized search based on query logs and then evaluate five personalized search algorithms (two click-based and three topical-interest-based) using 12 days of query logs from Windows Live Search. Our analysis reveals that personalized Web search does not work equally well in all situations. It yields a significant improvement over generic Web search for some queries, while it has little effect on others and even harms performance in some situations. We propose click entropy as a simple measure of whether a query should be personalized. We further propose several features to automatically predict when a query will benefit from a specific personalization algorithm. Experimental results show that applying a personalization algorithm only to queries selected by our prediction model is better than applying it to all queries.
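The abstract names click entropy but does not define it. The sketch below is one plausible reading, assuming it is the Shannon entropy (in bits) of the distribution of clicked URLs observed for a single query. The function name `click_entropy` and the sample click lists are hypothetical illustrations, not taken from the paper or its data.

```python
import math
from collections import Counter


def click_entropy(clicked_urls):
    """Shannon entropy (bits) of the click distribution for one query.

    Assumption: `clicked_urls` holds one entry per logged click on a
    result of the query, across all users. Low entropy means users
    mostly agree on a single result; high entropy means clicks are
    spread over many results.
    """
    counts = Counter(clicked_urls)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no clicks logged, so there is no spread to measure
    # Sum p * log2(1/p) over the distinct clicked URLs.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical click logs for two queries.
navigational = ["site-a", "site-a", "site-a"]          # everyone clicks one result
ambiguous = ["site-a", "site-a", "site-b", "site-b"]   # clicks split evenly

print(click_entropy(navigational))  # 0.0
print(click_entropy(ambiguous))     # 1.0
```

Under this reading, a query whose clicks concentrate on one result gives little room for personalization to help, whereas a query with widely spread clicks is a more promising candidate. Any cutoff between the two would have to be chosen empirically.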
INDEX TERMS
Web search, personalization, information filtering, performance evaluation.
CITATION
Zhicheng Dou, Ruihua Song, Ji-Rong Wen, Xiaojie Yuan, "Evaluating the Effectiveness of Personalized Web Search," IEEE Transactions on Knowledge & Data Engineering, vol. 21, no. 8, pp. 1178-1190, August 2009, doi:10.1109/TKDE.2008.172