Issue No.01 - Jan. (2013 vol.25)
pp: 177-191
Xue-Qi Cheng , Chinese Academy of Sciences, Beijing
Pan Du , Chinese Academy of Sciences, Beijing
Jiafeng Guo , Chinese Academy of Sciences, Beijing
Xiaofei Zhu , Chinese Academy of Sciences, Beijing
Yixin Chen , Washington University in St. Louis, St. Louis
ABSTRACT
Ranking is an important problem in various applications, such as information retrieval (IR), natural language processing, computational biology, and the social sciences. Many ranking approaches have been proposed to rank objects according to their degree of relevance or importance. Beyond these two goals, diversity has also been recognized as a crucial criterion in ranking: top-ranked results are expected to convey as little redundant information as possible and to cover as many aspects as possible. However, existing ranking approaches either take no account of diversity or handle it separately with heuristics. In this paper, we introduce a novel approach, Manifold Ranking with Sink Points (MRSP), to address diversity as well as relevance and importance in ranking. Specifically, our approach uses a manifold ranking process over the data manifold, which can naturally find the most relevant and important data objects. Meanwhile, by turning ranked objects into sink points on the data manifold, we can effectively prevent redundant objects from receiving a high rank. MRSP not only shows a nice convergence property but also has an interesting and satisfying optimization explanation. We applied MRSP to two tasks, update summarization and query recommendation, where diversity is of great concern in ranking. Experimental results on both tasks show that MRSP performs strongly compared with existing ranking approaches.
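The idea of turning ranked objects into sink points can be illustrated with a small sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: the function name `mrsp_rank`, the affinity matrix `W`, the prior relevance vector `y`, the damping parameter `alpha`, the fixed iteration count `iters`, and the update rule itself, which is borrowed from the commonly used manifold ranking iteration with a symmetrically normalized affinity matrix. It should not be read as the authors' exact algorithm.

```python
import numpy as np

def mrsp_rank(W, y, k, alpha=0.85, iters=200):
    """Greedy top-k selection in the spirit of manifold ranking with
    sink points. Illustrative sketch only; all names are hypothetical."""
    W = np.asarray(W, dtype=float)
    y = np.asarray(y, dtype=float)
    n = W.shape[0]

    # Assumed normalization: S = D^{-1/2} W D^{-1/2}
    d = W.sum(axis=1)
    d[d == 0] = 1.0  # isolated nodes: avoid division by zero
    inv_sqrt = 1.0 / np.sqrt(d)
    S = W * inv_sqrt[:, None] * inv_sqrt[None, :]

    free = np.ones(n)  # 1.0 for free points, 0.0 for sink points
    selected = []
    for _ in range(min(k, n)):
        f = y.copy()
        for _ in range(iters):
            # Sink points are held at score zero, so they spread no
            # score to their neighbours during propagation.
            f = free * (alpha * S @ (free * f) + (1 - alpha) * y)
        # Pick the best-scoring free point and turn it into a sink.
        best = int(np.argmax(np.where(free > 0, f, -np.inf)))
        selected.append(best)
        free[best] = 0.0
    return selected
```

Under these assumptions, once an object is selected its neighbours lose the score it used to pass to them, so near-duplicates of an already ranked object drop in later rounds while objects in other regions of the graph keep their scores. A fixed iteration count is used here for simplicity; a convergence tolerance would be the more usual stopping rule.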
INDEX TERMS
Manifolds, Diversity reception, Convergence, Turning, Eigenvalues and eigenfunctions, Redundancy, Algorithm design and analysis, query recommendation, Diversity in ranking, manifold ranking with sink points, update summarization
CITATION
Xue-Qi Cheng, Pan Du, Jiafeng Guo, Xiaofei Zhu, Yixin Chen, "Ranking on Data Manifold with Sink Points", IEEE Transactions on Knowledge & Data Engineering, vol.25, no. 1, pp. 177-191, Jan. 2013, doi:10.1109/TKDE.2011.190