Wildcard Search in Structured Peer-to-Peer Networks
November 2007 (vol. 19 no. 11)
pp. 1524-1540
We address wildcard search in structured peer-to-peer (P2P) networks, which, to our knowledge, has not yet been explored in the literature. We begin by presenting an approach based on well-known techniques in information retrieval (IR) and discuss why it is not appropriate in a distributed environment. We then present a simple and novel technique to index objects for wildcard search in a fully decentralized manner, along with several search strategies for retrieving objects. Our index scheme, as opposed to a traditional IR approach, can achieve well-balanced loads, avoid hot spots and single points of failure, reduce storage and maintenance costs, and offer ranking mechanisms for matching objects. We use the CD records collected in FreeDB (http://freedb.org) as the experimental dataset to evaluate our scheme. The results confirm that our index scheme is very effective in balancing the load. Moreover, search efficiency depends on the information given in a query: the more information a query provides, the more efficient the search.

Index Terms:
Information Storage and Retrieval, Distributed applications
Citation:
Yuh-Jzer Joung, Li-Wei Yang, "Wildcard Search in Structured Peer-to-Peer Networks," IEEE Transactions on Knowledge and Data Engineering, vol. 19, no. 11, pp. 1524-1540, Nov. 2007, doi:10.1109/TKDE.2007.190641