2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI'06)
Board Forum Crawling: A Web Crawling Method for Web Forum
Hong Kong, China
December 18-December 22
ISBN: 0-7695-2747-7
Yan Guo, Software Division, ICT, CAS
Kui Li, Software Division, ICT, CAS
Kai Zhang, Software Division, ICT, CAS
Gang Zhang, Software Division, ICT, CAS
We present Board Forum Crawling, a new method for crawling Web forums. The method exploits the organized structure of Web forum sites and simulates the way a human visits a forum: it starts from the homepage, enters each board of the site, and then crawls all the posts of the site directly. Board Forum Crawling can collect most of the meaningful information on a Web forum site simply and efficiently. We evaluated the method experimentally on real Web forum sites by comparing it with traditional breadth-first crawling. We also used the method in a real project, in which 12000 Web forum sites were crawled successfully. These results show the effectiveness of our method.
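The homepage-to-board-to-post traversal described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how boards and posts are recognized, so the URL patterns `BOARD_RE` and `POST_RE`, the toy `SITE` link graph, and the `fetch_links` helper are all hypothetical, chosen only to keep the example self-contained and free of network access.

```python
import re
from collections import deque
from typing import Callable, List

# Hypothetical URL patterns (assumptions, not from the paper);
# real forums use different schemes and these would need adapting.
BOARD_RE = re.compile(r"/board/\d+(\?page=\d+)?$")
POST_RE = re.compile(r"/post/\d+$")

# Toy in-memory link graph standing in for a real forum site.
SITE = {
    "/": ["/board/1", "/board/2", "/about", "/login"],
    "/board/1": ["/post/10", "/post/11", "/board/1?page=2", "/login"],
    "/board/1?page=2": ["/post/12", "/board/1"],
    "/board/2": ["/post/20", "/post/10", "/help"],
}


def fetch_links(url: str) -> List[str]:
    """Return the outgoing links of a page (stand-in for an HTTP fetch)."""
    return SITE.get(url, [])


def board_forum_crawl(home_url: str,
                      get_links: Callable[[str], List[str]]) -> List[str]:
    """Collect post URLs by visiting the homepage, then each board page."""
    boards = deque()
    seen_boards = set()
    # Step 1: read the homepage and keep only links that look like boards.
    for link in get_links(home_url):
        if BOARD_RE.search(link) and link not in seen_boards:
            seen_boards.add(link)
            boards.append(link)

    posts: List[str] = []
    seen_posts = set()
    # Step 2: visit each board page (including further board pages it
    # links to) and collect post links; every other link is ignored.
    while boards:
        board = boards.popleft()
        for link in get_links(board):
            if POST_RE.search(link):
                if link not in seen_posts:
                    seen_posts.add(link)
                    posts.append(link)
            elif BOARD_RE.search(link) and link not in seen_boards:
                seen_boards.add(link)
                boards.append(link)
    return posts


post_urls = board_forum_crawl("/", fetch_links)
```

In this sketch, pages such as `/about`, `/login`, and `/help` are never fetched, which is where a traversal of this kind would differ from a plain breadth-first crawl that follows every link it finds.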
Citation:
Yan Guo, Kui Li, Kai Zhang, Gang Zhang, "Board Forum Crawling: A Web Crawling Method for Web Forum," 2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI'06), pp. 745-748, 2006