5th International Workshop on Web Site Evolution
Using Keyword Extraction for Web Site Clustering
Amsterdam, The Netherlands
September 22, 2003
ISBN: 0-7695-2016-2
Paolo Tonella, ITC-irst
Filippo Ricca, ITC-irst
Emanuele Pianta, ITC-irst
Reverse engineering techniques have the potential to support Web site understanding by providing views that show the organization of a site and its navigational structure. However, representing each Web page as a node in the diagrams recovered from the source code of a Web site often leads to huge and unreadable graphs. Moreover, since the level of connectivity is typically high, the edges in such graphs make the overall result even less usable.
Clustering can be used to produce cohesive groups of pages that are displayed as a single node in reverse engineered diagrams. In this paper, we propose a clustering method based on the automatic extraction of the keywords of a Web page. The presence of common keywords is exploited to decide when it is appropriate to group pages together. The keywords are also used to label the recovered clusters of pages automatically.
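The general idea of grouping pages by shared keywords and labeling each group can be sketched as follows. This is only an illustrative sketch, not the method of the paper: the abstract does not specify the extraction or clustering algorithms, so the sketch assumes plain term frequency as a stand-in for keyword extraction, Jaccard overlap as the similarity measure, and single-link grouping above a threshold. All names (`extract_keywords`, `cluster_pages`, `STOPWORDS`, the sample pages) are hypothetical.

```python
import re
from collections import Counter

# Assumption: a tiny hand-picked stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "for", "is", "on", "our", "we", "with"}


def extract_keywords(text, top_n=5):
    """Return the top_n most frequent non-stopword terms.

    Assumption: raw term frequency stands in for real keyword extraction.
    """
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    return {word for word, _ in Counter(words).most_common(top_n)}


def jaccard(a, b):
    """Jaccard overlap between two keyword sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_pages(pages, threshold=0.3, top_n=5, label_size=3):
    """Group pages whose keyword sets overlap by at least `threshold`.

    Returns a list of (label, members) pairs, where the label holds the
    keywords that occur in the most pages of the cluster.
    """
    keywords = {name: extract_keywords(text, top_n) for name, text in pages.items()}
    parent = {name: name for name in pages}  # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    names = list(pages)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if jaccard(keywords[a], keywords[b]) >= threshold:
                parent[find(a)] = find(b)  # merge the two groups (single link)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)

    clusters = []
    for members in groups.values():
        counts = Counter(k for m in members for k in keywords[m])
        ranked = sorted(counts, key=lambda k: (-counts[k], k))  # deterministic ties
        clusters.append((ranked[:label_size], sorted(members)))
    return clusters


# Hypothetical sample site, with page text already stripped of markup.
pages = {
    "rooms.html": "hotel rooms hotel rooms booking",
    "booking.html": "hotel booking booking rooms prices",
    "staff.html": "staff contacts staff email",
}
for label, members in cluster_pages(pages, threshold=0.5, top_n=3):
    print(", ".join(label), "->", members)
```

Each resulting cluster could then be drawn as a single labeled node in a recovered site diagram. The threshold and the number of keywords per page are free parameters in this sketch and would need tuning on a real site.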
Citation:
Paolo Tonella, Filippo Ricca, Emanuele Pianta, Christian Girardi, "Using Keyword Extraction for Web Site Clustering," wse, pp.41, 5th International Workshop on Web Site Evolution, 2003