Issue No. 07 - July (2010 vol. 59)
ISSN: 0018-9340
pp: 969-980
Yunhao Liu, Hong Kong University of Science and Technology
Hanhua Chen, Huazhong University of Science and Technology, China
Lionel M. Ni, Hong Kong University of Science and Technology
Jun Yan, Microsoft Research Asia
Hai Jin, Huazhong University of Science and Technology, China
ABSTRACT
Previous multikeyword search in DHT-based P2P systems often relies on multiple single-keyword search operations and therefore suffers from unacceptable traffic cost and poor accuracy. Precomputing a term-set-based index can significantly reduce the cost, but the index size grows exponentially. Based on our observations that 1) queries are typically short and 2) users usually have limited interests, we propose a novel index pruning method, called TSS. By publishing only the most relevant term sets from the documents on the peers, TSS provides search performance comparable to that of a centralized solution, while the index size is reduced from exponential to O(n log n). We evaluate this design through comprehensive trace-driven simulations using the TREC WT10G data collection and the query log of a major commercial search engine.
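The idea of a pruned term-set index can be illustrated with a minimal sketch. This is not the TSS algorithm itself: the abstract does not say how relevance is measured, so ranking terms by raw frequency, the cutoff `k`, the bound `max_size`, and all function names here are assumptions made purely for illustration. A plain dictionary stands in for the DHT.

```python
from collections import Counter
from itertools import combinations


def top_terms(text, k):
    """Return the k most frequent terms (assumed relevance measure)."""
    counts = Counter(text.lower().split())
    # Sort by descending frequency, then alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def term_set_keys(text, k=3, max_size=2):
    """Enumerate term sets of size 1..max_size drawn from the top-k terms.

    Without pruning, a document with t distinct terms has 2**t - 1
    non-empty term sets; bounding k and max_size keeps the count small.
    """
    terms = sorted(top_terms(text, k))
    keys = []
    for size in range(1, max_size + 1):
        keys.extend(frozenset(c) for c in combinations(terms, size))
    return keys


def build_index(docs, k=3, max_size=2):
    """Map each published term set to the documents that contain it.

    In a DHT, each key would be hashed to the peer responsible for it;
    a local dict is used here instead.
    """
    index = {}
    for doc_id, text in docs.items():
        for key in term_set_keys(text, k, max_size):
            index.setdefault(key, set()).add(doc_id)
    return index


def search(index, query):
    """Answer a multikeyword query with a single term-set lookup."""
    return index.get(frozenset(query.lower().split()), set())
```

With this layout a two-keyword query is resolved by one lookup rather than by intersecting the results of two single-keyword lookups, which is the traffic saving the abstract refers to. The price is that a query whose term set was pruned returns nothing.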
INDEX TERMS
Peer-to-peer, multikeyword searching, ranking.
CITATION
Yunhao Liu, Hanhua Chen, Lionel M. Ni, Jun Yan, Hai Jin, "TSS: Efficient Term Set Search in Large Peer-to-Peer Textual Collections", IEEE Transactions on Computers, vol. 59, no. 7, pp. 969-980, July 2010, doi:10.1109/TC.2010.81