Issue No.04 - April (2011 vol.23)
pp: 585-599
Zhisheng Li , Singapore Management University, Singapore
Ken C.K. Lee , University of Massachusetts, Dartmouth
Baihua Zheng , Singapore Management University, Singapore
Wang-Chien Lee , Pennsylvania State University, University Park
Dik Lun Lee , Hong Kong University of Science and Technology, Hong Kong
Xufa Wang , University of Science and Technology of China, Hefei
Given a geographic query composed of query keywords and a location, a geographic search engine retrieves the documents that are most textually relevant to the keywords and most spatially relevant to the location, and ranks them by their joint textual and spatial relevance to the query. Because existing geographic search engines lack an efficient index that handles the textual and spatial aspects of documents simultaneously, they answer geographic queries inefficiently. In this paper, we propose an efficient index, called IR-tree, that, together with a top-k document search algorithm, facilitates four major tasks in document search, namely, 1) spatial filtering, 2) textual filtering, 3) relevance computation, and 4) document ranking, in a fully integrated manner. In addition, IR-tree allows searches to adopt different weights on the textual and spatial relevance of documents at runtime and thus caters for a wide variety of applications. A set of comprehensive experiments over a wide range of scenarios has been conducted, and the experimental results demonstrate that IR-tree outperforms the state-of-the-art approaches for geographic document search.
Geographic document search, index, search algorithm and IR-tree.
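To make the ranking objective in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not the IR-tree or the paper's search algorithm: it scans every document, whereas the point of the index is to avoid such a scan. The linear scoring form, the distance normalization by `max_dist`, and all names (`Doc`, `top_k`, `alpha`) are assumptions made for illustration only; the abstract states only that textual and spatial relevance are combined with weights chosen at runtime.

```python
import heapq
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Doc:
    doc_id: str
    location: Tuple[float, float]   # (x, y) position of the document
    terms: Dict[str, float]         # term -> precomputed weight in [0, 1] (assumed)


def textual_relevance(keywords: List[str], doc: Doc) -> float:
    """Average weight of the query keywords in the document (assumed measure)."""
    if not keywords:
        return 0.0
    return sum(doc.terms.get(k, 0.0) for k in keywords) / len(keywords)


def spatial_relevance(q_loc: Tuple[float, float], d_loc: Tuple[float, float],
                      max_dist: float) -> float:
    """Closeness in [0, 1]: 1 at the query location, 0 at max_dist or beyond."""
    return max(0.0, 1.0 - math.dist(q_loc, d_loc) / max_dist)


def top_k(docs: List[Doc], keywords: List[str], q_loc: Tuple[float, float],
          k: int, alpha: float, max_dist: float) -> List[str]:
    """Return the ids of the k best documents by joint relevance.

    alpha weights textual relevance and (1 - alpha) weights spatial
    relevance; the caller picks alpha per query, i.e. at runtime.
    """
    scored = []
    for doc in docs:
        text = textual_relevance(keywords, doc)
        if text == 0.0:
            continue  # textual filtering: no query keyword matches
        space = spatial_relevance(q_loc, doc.location, max_dist)
        scored.append((alpha * text + (1.0 - alpha) * space, doc.doc_id))
    # Document ranking: keep only the k highest joint scores.
    return [doc_id for _, doc_id in heapq.nlargest(k, scored)]
```

With a high `alpha` a distant document that matches the keywords strongly can outrank a nearby weak match, and with a low `alpha` the order reverses, which is the kind of per-query weighting the abstract refers to.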
Zhisheng Li, Ken C.K. Lee, Baihua Zheng, Wang-Chien Lee, Dik Lun Lee, Xufa Wang, "IR-Tree: An Efficient Index for Geographic Document Search", IEEE Transactions on Knowledge & Data Engineering, vol.23, no. 4, pp. 585-599, April 2011, doi:10.1109/TKDE.2010.149