A Set-Covering-Based Approach for Overlapping Resource Selection in Distributed Information Retrieval
Computer Science and Information Engineering, World Congress on (2009)
Los Angeles, California USA
Mar. 31, 2009 to Apr. 2, 2009
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/CSIE.2009.702
Resource selection, also called server selection, collection selection, or database selection, is a foundational problem in distributed information retrieval (DIR). This paper introduces a set-covering-based algorithm for resource selection in DIR that takes the extent of overlap between resources into account. Each document is assigned a weight according to its position in the merged results for a query Q. When later resources are scored, only results that have not already appeared in an earlier selected resource are considered. The score of a resource is the total weight of the merged results it contains, and in each selection step only the resource with the maximum score is selected. The selection order therefore gives the actual ranking of the selected resources, which are then used to search a query Q’ that is similar to Q. By avoiding redundant searches of overlapping databases, the approach saves considerable search time and, at the same time, improves recall and precision.
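The selection procedure described in the abstract can be illustrated with a minimal greedy sketch. It is not the authors' implementation: the function name `select_resources`, the `holdings` mapping, the limit `k`, and in particular the weighting function (1/rank) are assumptions made for illustration, since the abstract says only that weights depend on position in the merged results. The merged result list is assumed to contain no duplicate documents.

```python
from typing import Dict, List, Set


def select_resources(merged: List[str],
                     holdings: Dict[str, Set[str]],
                     k: int) -> List[str]:
    """Greedy, overlap-aware resource ranking (illustrative sketch).

    merged   -- merged result list for query Q, best result first
    holdings -- hypothetical map: resource name -> merged results it contains
    k        -- maximum number of resources to select
    """
    # Assumed weighting: 1/rank, so earlier merged results weigh more.
    weights = {doc: 1.0 / (rank + 1) for rank, doc in enumerate(merged)}
    uncovered = set(merged)      # results not yet covered by a selected resource
    remaining = dict(holdings)   # resources still available for selection
    order: List[str] = []

    def score(name: str) -> float:
        # Only results not covered by earlier selections contribute.
        return sum(weights[d] for d in remaining[name] & uncovered)

    while remaining and uncovered and len(order) < k:
        # Ties are broken alphabetically because max keeps the first maximum.
        best = max(sorted(remaining), key=score)
        if score(best) == 0:
            break                # no remaining resource adds a new result
        order.append(best)
        uncovered -= remaining.pop(best)
    return order


merged = ["d1", "d2", "d3", "d4"]
holdings = {"A": {"d1", "d2"}, "B": {"d2", "d3"}, "C": {"d3", "d4"}}
print(select_resources(merged, holdings, 3))  # ['A', 'C']
```

In this toy example resource B is never selected: once A and C have been chosen, every result B holds is already covered, so searching it would only return overlapping documents.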
Resource selection, Set-covering-based algorithm, Distributed information retrieval
S. Ju and X. Wang, "A Set-Covering-Based Approach for Overlapping Resource Selection in Distributed Information Retrieval," 2009 WRI World Congress on Computer Science and Information Engineering (CSIE), Los Angeles, CA, 2009, pp. 272-276.