Issue No.07 - July (2013 vol.25)
pp: 1498-1509
Shasha Li, National University of Defense Technology, Beijing
Chin-Yew Lin, Microsoft Research Asia, Beijing
Young-In Song, Microsoft Research Asia, Beijing
Zhoujun Li, Beihang University, Beijing
Comparing one thing with another is a typical part of the human decision-making process. However, it is not always easy to know what to compare and what the alternatives are. To address this difficulty, we present a novel way to automatically mine comparable entities from comparative questions that users have posted online. To ensure high precision and high recall, we develop a weakly supervised bootstrapping approach for comparative question identification and comparable entity extraction that leverages a large online question archive. The experimental results show that our method achieves an F1-measure of 82.5 percent in comparative question identification and 83.3 percent in comparable entity extraction. Both significantly outperform an existing state-of-the-art method. Additionally, our ranking results are highly relevant to users' comparison intents on the web.
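The bootstrapping idea named in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the `$C` slot notation, the function names, the one-word entity assumption, and the sample questions are all hypothetical, and the sketch omits any scoring of pattern reliability. It only shows the general loop in which a seed pattern yields entity pairs, and those pairs in turn yield new patterns.

```python
import re

SLOT = "$C"  # hypothetical placeholder marking a comparable-entity slot

def pattern_to_regex(pattern):
    # Escape the literal text, then turn each slot into a one-word capture group.
    parts = [re.escape(p) for p in pattern.split(SLOT)]
    return re.compile("^" + r"(\w+)".join(parts) + "$")

def extract_pairs(patterns, questions):
    # Collect (entity, entity) pairs from questions matched by a two-slot pattern.
    pairs = set()
    for q in questions:
        for p in patterns:
            m = pattern_to_regex(p).match(q)
            if m and len(m.groups()) == 2:
                pairs.add(m.groups())
    return pairs

def induce_patterns(pairs, questions):
    # Any question mentioning both entities of a known pair becomes a
    # candidate pattern once the entities are replaced by slots.
    patterns = set()
    for q in questions:
        tokens = q.split()
        for a, b in pairs:
            if a in tokens and b in tokens:
                patterns.add(" ".join(SLOT if t in (a, b) else t for t in tokens))
    return patterns

def bootstrap(seed_patterns, questions, max_iters=5):
    # Alternate extraction and pattern induction until nothing new is found.
    patterns, pairs = set(seed_patterns), set()
    for _ in range(max_iters):
        new_pairs = extract_pairs(patterns, questions)
        new_patterns = induce_patterns(new_pairs, questions)
        if new_pairs <= pairs and new_patterns <= patterns:
            break
        pairs |= new_pairs
        patterns |= new_patterns
    return patterns, pairs

# Toy data (made up for illustration): lowercased, space-tokenized questions.
QUESTIONS = [
    "which is better ipod or zune ?",
    "ipod vs zune which should i buy ?",
    "nikon vs canon which should i buy ?",
    "how do i reset my ipod ?",
]
SEED = ["which is better $C or $C ?"]

patterns, pairs = bootstrap(SEED, QUESTIONS)
```

On this toy data the seed pattern first extracts the pair (ipod, zune); that pair induces the pattern `$C vs $C which should i buy ?`, which then extracts (nikon, canon). The non-comparative question contributes nothing.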
Reliability, Portable media players, Data mining, Equations, Algorithm design and analysis, Cities and towns, Pattern matching, comparable entity mining, Information extraction, bootstrapping, sequential pattern mining
Shasha Li, Chin-Yew Lin, Young-In Song, Zhoujun Li, "Comparable Entity Mining from Comparative Questions", IEEE Transactions on Knowledge & Data Engineering, vol.25, no. 7, pp. 1498-1509, July 2013, doi:10.1109/TKDE.2011.210