2004 IEEE/WIC/ACM International Conference on Web Intelligence (WI'04)
Focused Crawling by Learning HMM from User's Topic-specific Browsing
Beijing, China
September 20-September 24
ISBN: 0-7695-2100-2
Hongyu Liu, Dalhousie University, Canada
Evangelos Milios, Dalhousie University, Canada
Jeannette Janssen, Dalhousie University, Canada
A focused crawler traverses the Web to gather documents on a specific topic. Predicting which links lead to relevant pages is not an easy task. In this paper, we present a new approach to predicting the important links to relevant pages, based on a learned user model. In particular, we first collect the pages that a user visits during a learning session, in which the user browses the Web and explicitly marks the pages she is interested in. We then examine the semantic content of these pages to construct a concept graph, which is used to learn the dominant content and link structure leading to target pages using a Hidden Markov Model (HMM). Experiments show that, with an HMM learned from a user's browsing, the crawler performs better than the Best-First strategy.
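To make the idea of ranking links with an HMM concrete, the following is a minimal sketch and not the authors' implementation. It assumes, for illustration only, that hidden states stand for how far a page is from a target page, that each visited page is observed as a topic-cluster id, and that the crawler scores a path by the probability that the next page is a target. All names (`forward_filter`, `target_score`) and all probabilities are hypothetical; in the approach described above the model would be learned from the user's browsing session rather than written by hand.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def forward_filter(obs: Sequence[int], init: List[float],
                   trans: Matrix, emit: Matrix) -> List[float]:
    """Normalised HMM forward pass: P(current hidden state | observations)."""
    n = len(init)
    # Combine the prior with the likelihood of the first observation.
    belief = [init[s] * emit[s][obs[0]] for s in range(n)]
    total = sum(belief)
    belief = [b / total for b in belief]
    for o in obs[1:]:
        # Predict the next state distribution, then weight by the emission.
        predicted = [sum(belief[p] * trans[p][s] for p in range(n))
                     for s in range(n)]
        belief = [predicted[s] * emit[s][o] for s in range(n)]
        total = sum(belief)
        belief = [b / total for b in belief]
    return belief


def target_score(obs: Sequence[int], init: List[float],
                 trans: Matrix, emit: Matrix, target_state: int = 0) -> float:
    """Probability that the page reached by the next link is in target_state."""
    belief = forward_filter(obs, init, trans, emit)
    return sum(belief[p] * trans[p][target_state] for p in range(len(belief)))


# Hypothetical toy model. Hidden states: 0 = target page,
# 1 = one link away, 2 = two or more links away.
# Observations: 0 = on-topic cluster, 1 = related cluster, 2 = off-topic.
INIT = [0.2, 0.3, 0.5]
TRANS = [[0.6, 0.3, 0.1],
         [0.5, 0.3, 0.2],
         [0.1, 0.4, 0.5]]
EMIT = [[0.80, 0.15, 0.05],
        [0.30, 0.50, 0.20],
        [0.05, 0.25, 0.70]]

if __name__ == "__main__":
    # A path that drifts toward the topic versus one that stays off-topic.
    print(target_score([2, 1], INIT, TRANS, EMIT))
    print(target_score([2, 2], INIT, TRANS, EMIT))
```

Under these assumed numbers, a path whose most recent page falls in the related cluster scores higher than one that stays off-topic, so a crawler ordering its frontier by `target_score` would expand the former first. A Best-First crawler, by contrast, would typically rank links using only the relevance of the current page.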
Citation:
Hongyu Liu, Evangelos Milios, Jeannette Janssen, "Focused Crawling by Learning HMM from User's Topic-specific Browsing," wi, pp.732-732, 2004 IEEE/WIC/ACM International Conference on Web Intelligence (WI'04), 2004