Routing Queries through a Peer-to-Peer InfoBeacons Network Using Information Retrieval Techniques
December 2007 (vol. 18 no. 12)
pp. 1754-1765
In the InfoBeacons system, a peer-to-peer network of beacons cooperates to route queries to the best information sources. Many Internet sources are unwilling to cooperate in query routing beyond supporting simple searches. We adapt techniques from information retrieval to deal with this lack of cooperation. In particular, beacons determine how to route queries based on information cached from sources' responses to earlier queries. In this paper, we examine alternative architectures for routing queries between beacons and to data sources. We also examine how to improve routing by probing sources in an informed way to learn about their content. Results of experiments using a beacon network to search 2,500 information sources demonstrate the effectiveness of our system; for example, our techniques require contacting up to 71 percent fewer sources than existing peer-to-peer random walk techniques.
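As a rough illustration of routing from cached responses, the sketch below shows a beacon that learns about sources only from the text they return and ranks them for a new query. It is a minimal sketch under stated assumptions, not the paper's method: the `Beacon` class, its method names, and the TF-IDF-style weighting are all hypothetical choices made for this example, since the abstract does not specify the actual ranking function.

```python
from collections import Counter, defaultdict
import math


class Beacon:
    """Hypothetical beacon that knows sources only through their responses."""

    def __init__(self):
        # source name -> counts of terms seen in that source's cached responses
        self.cache = defaultdict(Counter)

    def observe(self, source, response_text):
        """Cache the terms of a response returned by `source`."""
        self.cache[source].update(response_text.lower().split())

    def score(self, source, query):
        """Sum TF-IDF-style weights of the query terms (assumed weighting)."""
        n_sources = len(self.cache)
        total = 0.0
        for term in query.lower().split():
            tf = self.cache[source][term]
            if tf == 0:
                continue
            # number of cached sources whose responses contained the term
            df = sum(1 for counts in self.cache.values() if term in counts)
            total += (1 + math.log(tf)) * math.log(1 + n_sources / df)
        return total

    def route(self, query, k=1):
        """Return up to k sources with a positive score, best first."""
        ranked = sorted(self.cache, key=lambda s: self.score(s, query), reverse=True)
        return [s for s in ranked if self.score(s, query) > 0][:k]


beacon = Beacon()
beacon.observe("weather", "rain forecast storm rain")
beacon.observe("sports", "football score match")
best_sources = beacon.route("rain forecast")
```

Sources that never returned a query term score zero and are skipped, so the beacon contacts only sources its cache suggests are relevant. A query with no cached evidence returns an empty list, which is the case where probing sources would be needed.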

Index Terms:
peer-to-peer systems, information search and discovery
Citation:
Sangeetha Seshadri, Brian F. Cooper, "Routing Queries through a Peer-to-Peer InfoBeacons Network Using Information Retrieval Techniques," IEEE Transactions on Parallel and Distributed Systems, vol. 18, no. 12, pp. 1754-1765, Dec. 2007, doi:10.1109/TPDS.2007.1107