Digital Libraries, Joint Conference on (2004)
Tucson, AZ, USA
June 7, 2004 to June 11, 2004
ISBN: 1-58113-832-6
pp: 135-141
Michael Chau, The University of Hong Kong, Pokfulam
Jialun Qin, The University of Arizona, Tucson
Yilu Zhou, The University of Arizona, Tucson
Collecting domain-specific documents from the Web using focused crawlers is considered one of the most important strategies for building digital libraries that serve the scientific community. However, because most focused crawlers use local search algorithms to traverse the Web, they can easily become trapped within a limited sub-graph surrounding the starting URLs, and the resulting domain-specific collections are not comprehensive or diverse enough for scientists and researchers. In this study, we investigated the problems that local search algorithms cause in traditional focused crawlers and proposed a new crawling approach, meta-search enhanced focused crawling, to address them. We conducted two user evaluation experiments to examine the performance of the proposed approach; the results showed that it could build domain-specific collections of higher quality than traditional focused crawling techniques.
Digital libraries, domain-specific collection building, focused crawling, meta-search, Web search algorithm
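The local-search problem described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the toy link graph, the relevance scores, and the `meta_search` helper are all hypothetical, and the sketch assumes that meta-search results are simply added to the crawl frontier as extra starting points. It shows how a best-first crawler that only follows links from its seed never reaches a relevant community with no inbound link path, while the same crawler with meta-search seeds does.

```python
import heapq

# Hypothetical toy Web graph: communities "a*" and "b*" are both relevant,
# but no link path leads from the "a" pages to the "b" pages.
LINKS = {
    "a1": ["a2", "a3"],
    "a2": ["a1", "x1"],
    "a3": ["a2"],
    "x1": [],
    "b1": ["b2"],
    "b2": ["b1", "b3"],
    "b3": [],
}

# Hypothetical relevance scores in [0, 1]; a real crawler would compute
# these from page content.
RELEVANCE = {"a1": 0.9, "a2": 0.8, "a3": 0.7, "x1": 0.1,
             "b1": 0.9, "b2": 0.8, "b3": 0.7}


def meta_search(query):
    """Stand-in for querying several search engines and merging the results.

    The result lists are hard-coded; the query is ignored in this sketch.
    """
    engine_results = [["a2", "b1"], ["b2", "a1"]]
    merged = []
    for results in engine_results:
        for url in results:
            if url not in merged:  # drop duplicates across engines
                merged.append(url)
    return merged


def focused_crawl(seeds, threshold=0.5, extra_seeds=()):
    """Best-first crawl: always expand the most relevant frontier page."""
    frontier = [(-RELEVANCE[u], u) for u in list(seeds) + list(extra_seeds)]
    heapq.heapify(frontier)  # min-heap on negated score = max-relevance first
    visited, collection = set(), []
    while frontier:
        _, url = heapq.heappop(frontier)
        if url in visited:
            continue
        visited.add(url)
        if RELEVANCE[url] < threshold:
            continue  # irrelevant page: neither collected nor expanded
        collection.append(url)
        for out in LINKS[url]:
            if out not in visited:
                heapq.heappush(frontier, (-RELEVANCE[out], out))
    return sorted(collection)


baseline = focused_crawl(["a1"])
enhanced = focused_crawl(["a1"], extra_seeds=meta_search("domain keywords"))
print(baseline)
print(enhanced)
```

In this toy setting the baseline crawl stays inside the community around its seed, whereas the meta-search seeded crawl also collects the disconnected community. How the paper's crawler actually interleaves meta-search with link following is not specified in the abstract.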
