Mining Web Graphs for Recommendations
June 2012 (vol. 24 no. 6)
pp. 1051-1064
Hao Ma, The Chinese University of Hong Kong, Hong Kong
Irwin King, The Chinese University of Hong Kong, Hong Kong
Michael Rung-Tsong Lyu, The Chinese University of Hong Kong, Hong Kong
With the exponential growth of content generated on the Web, recommendation techniques have become increasingly indispensable. Countless recommendations of many kinds are made on the Web every day, including movie, music, image, and book recommendations, query suggestions, and tag recommendations. Regardless of the type of data source used for recommendation, these data sources can essentially be modeled as various types of graphs. In this paper, aiming to provide a general framework for mining Web graphs for recommendations, we 1) first propose a novel diffusion method that propagates similarities between different nodes and generates recommendations, and 2) then illustrate how to generalize different recommendation problems into our graph diffusion framework. The proposed framework can be used in many recommendation tasks on the World Wide Web, including query suggestion, tag recommendation, expert finding, image recommendation, and image annotation. Experimental analysis on large data sets indicates that the approach is promising.
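The abstract does not spell out the diffusion model, so the following is only a minimal sketch of the general idea, assuming a simple discrete heat-diffusion scheme on a weighted directed graph. It is not the authors' algorithm. The function names (`diffuse`, `recommend`), the parameters (`alpha`, `steps`, `keep`), and the toy query-URL graph are all hypothetical and chosen for illustration.

```python
def diffuse(edges, seed, alpha=1.0, steps=20):
    """Approximate heat diffusion on a weighted directed graph (illustrative only).

    edges: dict mapping node -> {neighbor: weight}; heat flows node -> neighbor.
    seed:  dict mapping node -> initial heat.
    alpha: total diffusion "time" (assumed parameter); steps: number of small updates.
    """
    if steps < 1 or alpha < 0 or alpha > steps:
        raise ValueError("need steps >= 1 and 0 <= alpha <= steps")
    nodes = set(edges) | {v for nbrs in edges.values() for v in nbrs} | set(seed)
    heat = {n: float(seed.get(n, 0.0)) for n in nodes}
    rate = alpha / steps  # fraction of a node's heat released per update
    for _ in range(steps):
        delta = {n: 0.0 for n in nodes}
        for u, nbrs in edges.items():
            total = sum(nbrs.values())
            if total == 0:
                continue
            for v, w in nbrs.items():
                # Heat leaves u in proportion to the edge weight and arrives at v.
                flow = rate * heat[u] * w / total
                delta[u] -= flow
                delta[v] += flow
        for n in nodes:
            heat[n] += delta[n]
    return heat


def recommend(edges, source, k=3, keep=None, alpha=1.0, steps=20):
    """Rank nodes by the heat they receive from `source`; `keep` optionally filters candidates."""
    heat = diffuse(edges, {source: 1.0}, alpha=alpha, steps=steps)
    candidates = [n for n in heat if n != source and (keep is None or keep(n))]
    return sorted(candidates, key=lambda n: heat[n], reverse=True)[:k]


# Hypothetical query-URL click graph; an undirected edge is stored in both directions.
edges = {
    "q:jaguar": {"u:cars.example": 3, "u:zoo.example": 1},
    "u:cars.example": {"q:jaguar": 3, "q:sports car": 2},
    "u:zoo.example": {"q:jaguar": 1, "q:big cats": 2},
    "q:sports car": {"u:cars.example": 2},
    "q:big cats": {"u:zoo.example": 2},
}

# Query suggestion: keep only query nodes as candidates.
suggestions = recommend(edges, "q:jaguar", k=2, keep=lambda n: n.startswith("q:"))
```

In this sketch, the same two functions could serve other tasks named in the abstract by changing what the nodes represent (for example, images and tags instead of queries and URLs); how the paper actually constructs each graph is not stated here.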

Index Terms:
Recommendation, diffusion, query suggestion, image recommendation.
Citation:
Hao Ma, Irwin King, Michael Rung-Tsong Lyu, "Mining Web Graphs for Recommendations," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 6, pp. 1051-1064, June 2012, doi:10.1109/TKDE.2011.18