Issue No. 10 - Oct. (2014 vol. 26)
ISSN: 1041-4347
pp: 2354-2367
Sitong Liu, Department of Computer Science and Technology, Tsinghua National Laboratory for Information Science and Technology (TNList), Tsinghua University, Beijing, China
Guoliang Li, Department of Computer Science and Technology, Tsinghua National Laboratory for Information Science and Technology (TNList), Tsinghua University, Beijing, China
Jianhua Feng, Department of Computer Science and Technology, Tsinghua National Laboratory for Information Science and Technology (TNList), Tsinghua University, Beijing, China
ABSTRACT
Location-based services have attracted significant attention now that modern mobile phones are equipped with GPS devices. These services generate large amounts of spatio-textual data, which contain both spatial locations and textual descriptions. Since a spatio-textual object may have different representations, possibly because of GPS deviations or differing user descriptions, efficient methods are needed to integrate spatio-textual data from different sources. In this paper we study a new research problem called spatio-textual similarity join: given two sets of spatio-textual objects, find the similar object pairs. We make the following contributions: (1) We develop a filter-and-refine framework and devise several efficient algorithms. We extend the prefix filter technique to generate spatial and textual signatures for the objects and build an inverted index on top of these signatures. We then generate candidate pairs using the inverted lists of signatures, and finally refine the candidates to produce the final result. (2) We study how to generate high-quality signatures for spatial information. We develop an MBR-prefix based signature to prune large numbers of dissimilar object pairs. (3) We propose a hybrid signature scheme to support textual pruning and spatial pruning simultaneously. (4) Experimental results on real and synthetic datasets show that our algorithms achieve high performance and scale well.
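The filter-and-refine idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it applies only the standard textual prefix filter for Jaccard similarity, and it assumes, purely for illustration, that spatial similarity means Euclidean distance within a threshold `eps`, checked during refinement. The MBR-prefix and hybrid signatures are not reproduced here. The function names (`prefix`, `st_join`), the object layout `(id, (x, y), tokens)`, and the parameters `tau` and `eps` are all hypothetical.

```python
from collections import defaultdict
from math import ceil, hypot


def jaccard(a, b):
    """Jaccard similarity of two token collections."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a or b) else 1.0


def prefix(tokens, freq, tau):
    """Textual signature: the first |x| - ceil(tau * |x|) + 1 tokens
    under a global order (rarest token first, ties broken by name)."""
    toks = sorted(set(tokens), key=lambda t: (freq[t], t))
    # Small epsilon guards against float error such as 0.7 * 10 > 7.
    keep = len(toks) - ceil(tau * len(toks) - 1e-9) + 1
    return toks[:keep]


def st_join(R, S, tau, eps):
    """Return (r_id, s_id) pairs with Jaccard >= tau and distance <= eps.
    The distance predicate is an assumption made for this sketch."""
    # Global token frequencies define the shared token order.
    freq = defaultdict(int)
    for _, _, toks in R + S:
        for t in set(toks):
            freq[t] += 1

    # Filter step 1: inverted index over the prefix tokens of S.
    index = defaultdict(list)
    for i, (_, _, toks) in enumerate(S):
        for t in prefix(toks, freq, tau):
            index[t].append(i)

    results = []
    for rid, rloc, rtoks in R:
        # Filter step 2: candidates share at least one prefix token.
        cands = set()
        for t in prefix(rtoks, freq, tau):
            cands.update(index.get(t, ()))
        # Refine step: verify both predicates on each candidate.
        for i in sorted(cands):
            sid, sloc, stoks = S[i]
            close = hypot(rloc[0] - sloc[0], rloc[1] - sloc[1]) <= eps
            if close and jaccard(rtoks, stoks) >= tau:
                results.append((rid, sid))
    return results


R = [("r1", (0.0, 0.0), ["coffee", "shop", "starbucks"]),
     ("r2", (5.0, 5.0), ["pizza", "hut"])]
S = [("s1", (0.1, 0.0), ["starbucks", "coffee"]),
     ("s2", (0.0, 0.1), ["tea", "house"]),
     ("s3", (50.0, 50.0), ["pizza", "hut"])]
pairs = st_join(R, S, tau=0.6, eps=1.0)
```

In this toy input, `r2` and `s3` have identical text but lie far apart, so the pair survives the textual filter and is rejected only in the refine step. That is the kind of wasted candidate that a spatial or hybrid signature, as described in the abstract, is meant to prune earlier.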
INDEX TERMS
visual databases, data integration, filtering theory, Global Positioning System, mobile computing, mobile radio
CITATION

S. Liu, G. Li and J. Feng, "A Prefix-Filter based Method for Spatio-Textual Similarity Join," in IEEE Transactions on Knowledge & Data Engineering, vol. 26, no. 10, pp. 2354-2367, 2014.
doi:10.1109/TKDE.2013.83