Boolean Similarity Measures for Resource Discovery
November-December 1997 (vol. 9 no. 6)
pp. 863-876

Abstract—As the number of Internet servers increases rapidly, it becomes difficult to determine the relevant servers when searching for information. We develop a new method to rank Internet servers for Boolean queries. Our method reduces time and space complexity from exponential to polynomial in the number of Boolean terms. We contrast it with other known methods and describe its implementation.


Index Terms:
Boolean query, information retrieval, ranking, resource discovery, similarity measure.
Citation:
Shih-Hao Li, Peter B. Danzig, "Boolean Similarity Measures for Resource Discovery," IEEE Transactions on Knowledge and Data Engineering, vol. 9, no. 6, pp. 863-876, Nov.-Dec. 1997, doi:10.1109/69.649313