2011 IEEE 11th International Conference on Data Mining (2011)
Vancouver, Canada
Dec. 11, 2011 to Dec. 14, 2011
ISSN: 1550-4786
ISBN: 978-0-7695-4408-3
pp: 221-230
ABSTRACT
Understanding users' search intents from their short, condensed queries has attracted much attention in both academia and industry. Search intents are generally assumed to be associated with query patterns such as "MobileName price", where "MobileName" can be any named entity denoting a mobile phone model; this pattern indicates that the user intends to buy a mobile phone. However, discovering query intent patterns for general search is challenging, mainly because it is difficult to collect sufficient training data for learning query patterns across a large number of searchable domains. In this work, we propose the Cross Domain Random Walk (CDRW) algorithm, a semi-supervised method that discovers query intent patterns across different domains from search engine click-through log data. Starting with some manually tagged seed queries in one or more independent domains, CDRW uses query patterns as a bridge and propagates transition probabilities across domains to collect query intent patterns, based on the assumption that "users who have similar intent in different but similar domains will have high probability to share similar query patterns across domains". Unlike classical random walk algorithms, CDRW walks across different domains to disseminate shared knowledge in a transfer learning manner. Extensive experimental results on real log data from a commercial search engine validate the effectiveness and efficiency of the proposed algorithm.
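The idea of using query patterns as a bridge between domains can be illustrated with a minimal sketch. This is not the CDRW algorithm from the paper: it ignores click-through data, replaces the transition-probability formulation with a simple averaging scheme, and collapses every entity into one generic slot `<E>` so that identical context words link domains. The function names (`to_pattern`, `propagate`), the parameters `steps` and `alpha`, and the toy queries and entity lists are all assumptions made for illustration.

```python
from collections import defaultdict


def to_pattern(query, entities):
    """Replace the first known entity in the query with a generic slot."""
    for entity in entities:
        if entity in query:
            return query.replace(entity, "<E>").strip()
    return None  # no known entity, so no pattern can be formed


def propagate(queries, entities_by_domain, seeds, steps=5, alpha=0.5):
    """Spread seed intent scores over a query-pattern bipartite graph.

    queries: list of (domain, query_text) pairs
    seeds:   {query_text: score} for the manually tagged queries
    alpha:   weight kept on the original seed score at every step
    """
    pattern_of = {}
    members = defaultdict(list)
    for domain, query in queries:
        pattern = to_pattern(query, entities_by_domain[domain])
        if pattern is not None:
            pattern_of[query] = pattern
            members[pattern].append(query)

    q_score = {q: seeds.get(q, 0.0) for q in pattern_of}
    p_score = {}
    for _ in range(steps):
        # A pattern's score is the mean score of the queries that match it.
        p_score = {p: sum(q_score[q] for q in qs) / len(qs)
                   for p, qs in members.items()}
        # A query mixes its own seed score with the score of its pattern.
        q_score = {q: alpha * seeds.get(q, 0.0)
                   + (1 - alpha) * p_score[pattern_of[q]]
                   for q in pattern_of}
    return p_score, q_score


# Toy data (hypothetical): one seed query tagged in the "mobile" domain only.
entities_by_domain = {
    "mobile": ["iphone 4", "nokia n8"],
    "camera": ["canon 550d"],
}
queries = [
    ("mobile", "iphone 4 price"),
    ("mobile", "nokia n8 price"),
    ("mobile", "iphone 4 review"),
    ("camera", "canon 550d price"),
    ("camera", "canon 550d manual"),
]
seeds = {"iphone 4 price": 1.0}

p_score, q_score = propagate(queries, entities_by_domain, seeds)
for pattern, score in sorted(p_score.items(), key=lambda kv: -kv[1]):
    print(f"{pattern}: {score:.3f}")
```

In this toy run the seed tagged in the mobile domain raises the score of the shared pattern `<E> price`, which in turn gives the untagged camera query "canon 550d price" a nonzero score, while "canon 550d manual" stays at zero because its pattern has no connection to any seed.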
INDEX TERMS
query intent pattern, transfer learning, random walk, semi-supervised learning
CITATION

J. Huang et al., "Cross Domain Random Walk for Query Intent Pattern Mining from Search Engine Log," 2011 IEEE 11th International Conference on Data Mining (ICDM), Vancouver, Canada, 2011, pp. 221-230.
doi:10.1109/ICDM.2011.44