Issue No.01 - January (2009 vol.20)
pp: 71-82
Hanhua Chen , Huazhong University of Science and Technology, Wuhan
Hai Jin , Huazhong University of Science and Technology, Wuhan
Yunhao Liu , Hong Kong University of Science and Technology, Hong Kong
Lionel M. Ni , Hong Kong University of Science and Technology, Hong Kong
ABSTRACT
By combining an unstructured protocol with a DHT-based index, hybrid Peer-to-Peer (P2P) search improves search efficiency in terms of query recall and response time. The key challenge in hybrid search is to estimate the number of peers that can answer a given query. Existing approaches assume that this number can be obtained directly by computing item popularity. In this work, we show that this assumption is not always valid, and that previous designs cannot distinguish whether the items related to a query are distributed across many peers or held by only a few. To address this issue, we propose QRank, a difficulty-aware hybrid search that ranks queries by weighting keywords based on term frequency. Using the rank values, QRank selects a proper search strategy for each query. We conduct comprehensive trace-driven simulations to evaluate this design. Results show that QRank significantly improves search quality and reduces system traffic cost compared with existing approaches.
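The abstract does not give QRank's actual ranking formula, so the following is only a minimal sketch of the general idea: weight each keyword by how rare it is, add the weights into a query rank, and pick a search strategy from that rank. The logarithmic weight, the summation, the threshold value, and all function names here are illustrative assumptions, not the authors' method.

```python
import math

# Hypothetical sketch of difficulty-aware strategy selection.
# Nothing below is taken from the paper: the weight formula, the
# summation, and the threshold are assumptions made for illustration.

def keyword_weight(term_frequency, total_peers):
    """Assumed IDF-style weight: keywords held by fewer peers score higher."""
    return math.log((total_peers + 1) / (term_frequency + 1))

def query_rank(keywords, term_frequencies, total_peers):
    """Assumed rank: the sum of keyword weights. Unknown keywords count as frequency 0."""
    return sum(
        keyword_weight(term_frequencies.get(keyword, 0), total_peers)
        for keyword in keywords
    )

def choose_strategy(rank, threshold):
    """High rank means a difficult (rare) query, so use the DHT index; otherwise flood."""
    return "dht" if rank >= threshold else "flood"

# Example with made-up term frequencies (number of peers holding each term).
term_frequencies = {"music": 900, "mp3": 800, "qrank": 2}
total_peers = 1000
threshold = 2.0  # hypothetical cut-off between easy and difficult queries

easy = query_rank(["music", "mp3"], term_frequencies, total_peers)
hard = query_rank(["qrank", "music"], term_frequencies, total_peers)
print(choose_strategy(easy, threshold))
print(choose_strategy(hard, threshold))
```

In this sketch a query made only of widely held keywords gets a low rank and is flooded over the unstructured overlay, while a query containing a rare keyword gets a high rank and is sent to the DHT index.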
INDEX TERMS
Distributed architectures, Emerging technologies, Distributed Systems
CITATION
Hanhua Chen, Hai Jin, Yunhao Liu, Lionel M. Ni, "Difficulty-Aware Hybrid Search in Peer-to-Peer Networks", IEEE Transactions on Parallel & Distributed Systems, vol.20, no. 1, pp. 71-82, January 2009, doi:10.1109/TPDS.2008.72