This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
2011 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies
Search Engine Query Clustering Using Top-k Search Results
Lyon France
August 22-August 27
ISBN: 978-0-7695-4513-4
Clustering of search engine queries has attracted significant attention in recent years. Many search engine applications such as query recommendation require query clustering as a pre-requisite to function properly. Indeed, clustering is necessary to unlock the true value of query logs. However, clustering search queries effectively is quite challenging, due to the high diversity and arbitrary input by users. Search queries are usually short and ambiguous in terms of user requirements. Many different queries may refer to a single concept, while a single query may cover many concepts. Existing prevalent clustering methods, such as K-Means or DBSCAN cannot assure good results in such a diverse environment. Agglomerative clustering gives good results but is computationally quite expensive. This paper presents a novel clustering approach based on a key insight -- search engine results might themselves be used to identify query similarity. We propose a novel similarity metric for diverse queries based on the ranked URL results returned by a search engine for queries. This is used to develop a very efficient and accurate algorithm for clustering queries. Our experimental results demonstrate more accurate clustering performance, better scalability and robustness of our approach against known baselines.
Index Terms:
Search egine query clustering, Top-k search results, Clustering validation
Citation:
Yuan Hong, Jaideep Vaidya, Haibing Lu, "Search Engine Query Clustering Using Top-k Search Results," wi-iat, vol. 1, pp.112-119, 2011 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies, 2011
Usage of this product signifies your acceptance of the Terms of Use.