2007 IEEE 23rd International Conference on Data Engineering
Scalable Peer-to-Peer Web Retrieval with Highly Discriminative Keys
Istanbul, Turkey
April 15-April 20
ISBN: 1-4244-0802-4
BibTeX:
@inproceedings{10.1109/ICDE.2007.368968,
  author    = {Ivana Podnar and Martin Rajman and Toan Luu and Fabius Klemm and Karl Aberer},
  title     = {Scalable Peer-to-Peer Web Retrieval with Highly Discriminative Keys},
  booktitle = {2007 IEEE 23rd International Conference on Data Engineering},
  year      = {2007},
  isbn      = {1-4244-0802-4},
  pages     = {1096--1105},
  doi       = {10.1109/ICDE.2007.368968},
  url       = {http://doi.ieeecomputersociety.org/10.1109/ICDE.2007.368968},
  publisher = {IEEE Computer Society},
  address   = {Los Alamitos, CA, USA},
}
The suitability of Peer-to-Peer (P2P) approaches for full-text web retrieval has recently been questioned because of the allegedly unacceptable bandwidth consumption induced by retrieval from very large document collections. In this contribution we formalize a novel indexing/retrieval model that achieves high-performance, cost-efficient retrieval by indexing with highly discriminative keys (HDKs) stored in a distributed global index maintained in a structured P2P network. HDKs correspond to carefully selected terms and term sets appearing in a small number of collection documents. We provide a theoretical analysis of the scalability of our retrieval model and report experimental results obtained with our HDK-based P2P retrieval engine. These results show that, despite increased indexing costs, the total traffic generated with the HDK approach is significantly smaller than that generated by distributed single-term indexing strategies. Furthermore, our experiments show that the retrieval performance obtained with a random set of real queries is comparable to that of a centralized single-term solution using the state-of-the-art BM25 relevance computation scheme. Finally, our scalability analysis demonstrates that the HDK approach can scale to large networks of peers indexing web-size document collections, thus opening the way towards viable, truly decentralized web retrieval.
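To make the notion of a highly discriminative key concrete, the following is a minimal sketch of how keys "appearing in a small number of collection documents" might be selected. It is not the authors' implementation: the function name `select_hdks`, the document-frequency threshold `df_max`, the key-size limit `max_size`, and the rule that a term set is only considered when all of its smaller subsets are too frequent are illustrative assumptions that the abstract does not specify.

```python
from collections import defaultdict
from itertools import combinations


def select_hdks(docs, df_max=2, max_size=2):
    """Return {key: doc ids} for keys found in at most df_max documents.

    docs maps a document id to an iterable of terms. A key is a frozenset
    of terms and "appears" in a document that contains all of its terms.
    df_max and max_size are hypothetical parameters chosen for illustration.
    """
    doc_terms = {doc_id: set(terms) for doc_id, terms in docs.items()}
    hdks = {}
    frequent = set()  # keys of the previous size that were too frequent

    for size in range(1, max_size + 1):
        postings = defaultdict(set)
        for doc_id, terms in doc_terms.items():
            for combo in combinations(sorted(terms), size):
                key = frozenset(combo)
                # Assumption: a larger key is only worth indexing when every
                # smaller subset is too frequent to be discriminative itself.
                if size > 1 and any(key - {t} not in frequent for t in key):
                    continue
                postings[key].add(doc_id)

        frequent = set()
        for key, doc_ids in postings.items():
            if len(doc_ids) <= df_max:
                hdks[key] = doc_ids  # rare enough: keep as a discriminative key
            else:
                frequent.add(key)  # too common: candidate for expansion
    return hdks


if __name__ == "__main__":
    sample = {
        1: ["p2p", "web", "retrieval"],
        2: ["p2p", "web", "index"],
        3: ["p2p", "bm25"],
        4: ["web", "bm25"],
    }
    for key, doc_ids in select_hdks(sample).items():
        print(sorted(key), sorted(doc_ids))
```

In this toy collection the single terms "p2p" and "web" each occur in three documents, so under the assumed threshold they are not indexed on their own, while their combination occurs in only two documents and is kept. This illustrates the trade-off the abstract mentions: more keys are generated at indexing time, but each stored key maps to a short list of documents.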
Citation:
Ivana Podnar, Martin Rajman, Toan Luu, Fabius Klemm, Karl Aberer, "Scalable Peer-to-Peer Web Retrieval with Highly Discriminative Keys," 2007 IEEE 23rd International Conference on Data Engineering (ICDE), pp. 1096-1105, 2007.
