2010 Third International Symposium on Intelligent Information Technology and Security Informatics
A Focused Crawler Based on Naive Bayes Classifier
Jinggangshan, China
April 02-April 04
ISBN: 978-0-7695-4020-7
Wenxian Wang, Xingshu Chen, Yongbin Zou, Haizhou Wang, Zongkun Dai
pp. 517-521
IEEE Computer Society, Los Alamitos, CA, USA
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/IITSI.2010.30
The exponential growth of information on the World Wide Web makes it increasingly difficult to find relevant data on a specific topic. As a result, interest is growing in focused crawlers: programs that traverse the Web by selecting pages relevant to a predefined topic and ignoring the rest. This paper proposes a new focused crawler based on a Naive Bayes classifier. It uses an improved TF-IDF algorithm to extract features from page content and a Naive Bayes classifier to compute each page's rank. The crawler is compared with a BFS (breadth-first search) crawler and a PageRank crawler, and the results show that it achieves a better harvest ratio than both.
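The abstract does not spell out the improved TF-IDF weighting or the crawler's frontier policy, so the following is only a minimal sketch of the general idea under stated assumptions: it uses standard smoothed TF-IDF (not the paper's improved variant), a multinomial Naive Bayes classifier over TF-IDF-weighted terms, and a best-first frontier in which each outlink inherits the relevance score of the page that linked to it. The harvest ratio is taken in its usual sense, the fraction of downloaded pages that are relevant. All names (`TfidfNaiveBayes`, `focused_crawl`, the toy `web` dictionary, the `"on"`/`"off"` labels) are hypothetical and do not come from the paper.

```python
import heapq
import math
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


class TfidfNaiveBayes:
    """Multinomial Naive Bayes over standard TF-IDF weights (an assumption;
    the paper's improved TF-IDF is not described in the abstract)."""

    def fit(self, docs, labels):
        n = len(docs)
        tokens = [tokenize(d) for d in docs]
        df = Counter(t for toks in tokens for t in set(toks))
        # Smoothed inverse document frequency.
        self.idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
        self.classes = sorted(set(labels))
        self.log_prior = {c: math.log(labels.count(c) / n) for c in self.classes}
        weight = {c: defaultdict(float) for c in self.classes}
        for toks, y in zip(tokens, labels):
            for t, tf in Counter(toks).items():
                weight[y][t] += tf * self.idf[t]
        vocab = len(self.idf)
        # Laplace-smoothed per-class term likelihoods.
        self.log_like = {}
        for c in self.classes:
            total = sum(weight[c].values())
            self.log_like[c] = {
                t: math.log((weight[c][t] + 1.0) / (total + vocab)) for t in self.idf
            }
        return self

    def relevance(self, text, positive="on"):
        """Posterior probability that `text` belongs to the on-topic class."""
        counts = Counter(t for t in tokenize(text) if t in self.idf)
        scores = {
            c: self.log_prior[c]
            + sum(tf * self.idf[t] * self.log_like[c][t] for t, tf in counts.items())
            for c in self.classes
        }
        top = max(scores.values())
        norm = sum(math.exp(s - top) for s in scores.values())
        return math.exp(scores[positive] - top) / norm


def focused_crawl(web, seeds, clf, limit, threshold=0.5):
    """Best-first crawl over `web`, a dict url -> (text, outlinks).

    Outlinks are queued with the parent's relevance score as priority
    (a common heuristic, assumed here). Returns (visited, harvest_ratio).
    """
    frontier = [(-1.0, s) for s in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    visited, relevant = [], 0
    while frontier and len(visited) < limit:
        _, url = heapq.heappop(frontier)
        text, links = web[url]
        score = clf.relevance(text)
        visited.append(url)
        relevant += score >= threshold
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-score, link))
    ratio = relevant / len(visited) if visited else 0.0
    return visited, ratio


# Toy training set and in-memory "web" (hypothetical data for illustration).
clf = TfidfNaiveBayes().fit(
    [
        "python crawler fetches web pages",
        "focused crawler ranks web pages by topic",
        "football match score goal",
        "recipe for chocolate cake",
    ],
    ["on", "on", "off", "off"],
)
web = {
    "a": ("web crawler topic pages", ["b", "c"]),
    "b": ("football goal score", ["d"]),
    "c": ("focused crawler web", ["e"]),
    "d": ("chocolate cake recipe", []),
    "e": ("crawler ranks pages", []),
}
visited, harvest_ratio = focused_crawl(web, ["a"], clf, limit=4)
print(visited, harvest_ratio)
```

In this toy graph the link found on an on-topic page is fetched before the link found on an off-topic page, which is the behavior that lets a focused crawler spend its download budget on relevant pages. A BFS baseline would instead process the frontier in plain first-in, first-out order.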
Index Terms:
Focused Crawler, Naive Bayes, Classifier, TF-IDF
Citation:
Wenxian Wang, Xingshu Chen, Yongbin Zou, Haizhou Wang, Zongkun Dai, "A Focused Crawler Based on Naive Bayes Classifier," Intelligent Information Technology and Security Informatics, International Symposium on, pp. 517-521, 2010 Third International Symposium on Intelligent Information Technology and Security Informatics, 2010.
