Second International Conference on Semantics, Knowledge, and Grid (SKG'06)
An Efficient Token-based Approach for Web-Snippet Clustering
Guilin, Guangxi, China
November 1-3, 2006
ISBN: 0-7695-2673-X
Jianchao Li, Shanghai Jiao Tong University, China
Tianfang Yao, Shanghai Jiao Tong University, China
Online clustering of search engine results has become prevalent in recent years. It addresses the problem that current search engines return too many records, which makes manually finding the desired information difficult, especially when the query covers several subtopics. Clustering groups records into clusters and thereby makes it more convenient to retrieve the information of interest. First, we propose an innovative approach that uses tokens as the basic units for clustering, which avoids word segmentation for oriental languages and can be applied to any language. Second, we introduce a Directed Probability Graph (DPG) model that uses statistical methods, without any external knowledge, to identify meaningful phrases as cluster labels. The clustering procedure does not require calculating the similarity between each pair of documents. Our experiments show that the clustering algorithm is very efficient and suitable for online Web-snippet clustering.
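The abstract does not give the details of the DPG model, so the following Python sketch is only one possible reading of the general idea, not the authors' algorithm. It assumes that a token is either a run of Latin letters or digits or a single CJK character, that graph edges carry the estimated probability that one token directly follows another, and that a label is a run of tokens joined by frequent, high-probability edges. The function names (`tokenize`, `strong_edges`, `label_phrases`, `cluster`) and the thresholds `min_prob` and `min_count` are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Assumption: one token = a run of Latin letters/digits or a single CJK
# character, so Chinese text needs no word segmenter.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+|[\u4e00-\u9fff]")

def tokenize(text):
    return [t.lower() for t in TOKEN_RE.findall(text)]

def strong_edges(snippets, min_prob=0.6, min_count=2):
    """Map (a, b) to P(b directly follows a), keeping only frequent, likely edges."""
    counts = defaultdict(Counter)
    for s in snippets:
        toks = tokenize(s)
        for a, b in zip(toks, toks[1:]):
            counts[a][b] += 1
    edges = {}
    for a, followers in counts.items():
        total = sum(followers.values())
        for b, c in followers.items():
            if c >= min_count and c / total >= min_prob:
                edges[(a, b)] = c / total
    return edges

def label_phrases(snippets, edges):
    """Collect maximal token runs in which every adjacent pair is a strong edge."""
    phrases = set()
    for s in snippets:
        toks = tokenize(s)
        i = 0
        while i < len(toks) - 1:
            j = i
            while j + 1 < len(toks) and (toks[j], toks[j + 1]) in edges:
                j += 1
            if j > i:
                phrases.add(tuple(toks[i:j + 1]))
            i = j + 1
    return sorted(phrases)

def cluster(snippets, phrases):
    """Assign each snippet to every label it contains; no pairwise similarity is computed."""
    clusters = {p: [] for p in phrases}
    for idx, s in enumerate(snippets):
        toks = tokenize(s)
        for p in phrases:
            n = len(p)
            if any(tuple(toks[k:k + n]) == p for k in range(len(toks) - n + 1)):
                clusters[p].append(idx)
    return clusters

snippets = [
    "Monty Python comedy sketches",
    "Monty Python films and comedy",
    "Python programming language tutorial",
    "Learn the Python programming language",
]
labels = label_phrases(snippets, strong_edges(snippets))
print(cluster(snippets, labels))
# {('monty', 'python'): [0, 1], ('programming', 'language'): [2, 3]}
```

In this toy run, the ambiguous query term "python" has no dominant successor, so it does not form a label by itself, while "monty python" and "programming language" emerge as labels and split the four snippets into two groups. Each snippet is scanned against the labels rather than compared with every other snippet, which mirrors the abstract's point about avoiding pairwise similarity.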
Citation:
Jianchao Li, Tianfang Yao, "An Efficient Token-based Approach for Web-Snippet Clustering," skg, pp.13, Second International Conference on Semantics, Knowledge, and Grid (SKG'06), 2006