Fourth IEEE International Conference on Data Mining (ICDM'04)
The Anatomy of a Hierarchical Clustering Engine for Web-page, News and Book Snippets
Brighton, United Kingdom
November 01-November 04
ISBN: 0-7695-2142-8
Paolo Ferragina, Università di Pisa, Italy
Antonio Gullì, Università di Pisa, Italy
In this paper, we investigate the web snippet hierarchical clustering problem in its full extent by devising an algorithmic solution, and a software prototype called SnakeT (accessible at http://roquefort.di.unipi.it/), that: (1) draws the snippets from 16 Web search engines, the Amazon collection of books a9.com, the news of Google News and the blogs of Blogline; (2) builds the clusters on-the-fly (ephemeral clustering) in response to a user query without adopting any pre-defined organization in categories; (3) labels the clusters with sentences of variable length, drawn from the snippets and possibly missing some terms, provided they are not too many;
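Points (2) and (3) of the abstract can be illustrated with a minimal sketch. It is not SnakeT's algorithm, which the abstract does not spell out. It assumes a "gapped" label is an ordered pair of terms that occur close together in a snippet, with a few terms possibly skipped in between, and it groups snippets on the fly by the labels they share. The names `gapped_pairs`, `cluster_by_gapped_label`, `max_gap`, `min_size` and the stopword list are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical stopword list, chosen only for this illustration.
STOPWORDS = {"the", "a", "of", "and", "in", "for", "to"}


def gapped_pairs(snippet, max_gap=2):
    """Return ordered term pairs separated by at most `max_gap` skipped terms."""
    terms = [t for t in snippet.lower().split() if t not in STOPWORDS]
    pairs = set()
    for i, j in combinations(range(len(terms)), 2):
        # j - i - 1 terms lie between positions i and j.
        if j - i - 1 <= max_gap:
            pairs.add((terms[i], terms[j]))
    return pairs


def cluster_by_gapped_label(snippets, max_gap=2, min_size=2):
    """Group snippet indices under every gapped pair shared by >= min_size snippets."""
    groups = defaultdict(set)
    for idx, snippet in enumerate(snippets):
        for pair in gapped_pairs(snippet, max_gap):
            groups[pair].add(idx)
    # No predefined categories: labels come only from the snippets themselves.
    return {
        " ".join(pair): sorted(ids)
        for pair, ids in groups.items()
        if len(ids) >= min_size
    }


snippets = [
    "jaguar car dealer",
    "jaguar sports car",
    "jaguar big cat habitat",
]
clusters = cluster_by_gapped_label(snippets)
```

In this toy input the first two snippets share the pair "jaguar car", even though the second snippet has "sports" between the two terms, so they end up in the same cluster. A real engine would also need tokenization, ranking of candidate labels, and a hierarchy over the clusters, none of which is attempted here.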
Citation:
Paolo Ferragina, Antonio Gullì, "The Anatomy of a Hierarchical Clustering Engine for Web-page, News and Book Snippets," Fourth IEEE International Conference on Data Mining (ICDM'04), pp. 395-398, 2004