2016 IEEE 32nd International Conference on Data Engineering (ICDE) (2016)
Helsinki, Finland
May 16, 2016 to May 20, 2016
ISBN: 978-1-5090-2020-1
pp: 1576-1577
Huiqi Hu, Department of Computer Science, Tsinghua University, Beijing, China
Guoliang Li, Department of Computer Science, Tsinghua University, Beijing, China
Zhifeng Bao, Computer Science & Info Tech, RMIT University, Australia
Jianhua Feng, Department of Computer Science, Tsinghua University, Beijing, China
Yongwei Wu, Department of Computer Science, Tsinghua University, Beijing, China
Zhiguo Gong, Department of Computer and Information Science, University of Macau, China
Yaoqiang Xu, Department of Computer Science, Tsinghua University, Beijing, China
ABSTRACT
With the rapid development of mobile Internet technology, Internet users are shifting from desktop to mobile devices. Modern mobile devices (e.g., smartphones and tablets) are equipped with GPS, which lets users easily obtain their locations, and location-based services (LBS) have been widely deployed. LBS users are generating more and more spatio-textual data, which contains both textual descriptions and geographical locations. In user-generated data, a spatio-textual entity may have different representations, possibly due to GPS deviations or typographical errors [6], [2], which calls for effective methods to integrate spatio-textual data from different sources. A spatio-textual similarity join is an important operation in spatio-textual data integration: given two sets of spatio-textual objects, it finds all similar pairs from the two sets, where similarity is quantified by combining spatial proximity and textual relevancy. Spatio-textual similarity joins have many applications, e.g., user recommendation in location-based social networks, duplicate image detection using spatio-textual tags, spatio-textual advertising, and location-based market analysis [6], [2]. For example, a house rental agency (e.g., rent.com) may want to perform a similarity join between the spatio-textual data of house requirements from renters and the data of house properties from owners. As another example, a startup company, e.g., Factual (factual.com), crawls spatio-textual records to generate points of interest (POIs). As the records come from multiple sources and may contain many duplicates, it needs to run similarity joins to remove the duplicates.
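The join described above can be illustrated with a minimal brute-force sketch. The abstract does not state the similarity function or the algorithm, so everything here is an assumption made for illustration: the weighted sum with weight alpha, Jaccard similarity as the textual relevancy, Euclidean distance scaled by a hypothetical max_dist as the spatial proximity, and the record layout (id, loc, tokens). It scores every pair and keeps the k best, which is not the authors' method.

```python
import heapq
import math
from itertools import product


def jaccard(a, b):
    """Textual relevancy (assumed): Jaccard similarity of two token collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def spatial_proximity(p, q, max_dist):
    """Spatial proximity (assumed): 1 at distance 0, falling linearly to 0 at max_dist."""
    return max(0.0, 1.0 - math.dist(p, q) / max_dist)


def similarity(r, s, alpha=0.5, max_dist=10.0):
    """Weighted combination of spatial proximity and textual relevancy (assumed form)."""
    spatial = spatial_proximity(r["loc"], s["loc"], max_dist)
    textual = jaccard(r["tokens"], s["tokens"])
    return alpha * spatial + (1.0 - alpha) * textual


def topk_join(R, S, k, alpha=0.5, max_dist=10.0):
    """Brute force: score every pair in R x S and return the k highest-scoring
    (score, r_id, s_id) tuples in descending order of score."""
    scored = (
        (similarity(r, s, alpha, max_dist), r["id"], s["id"])
        for r, s in product(R, S)
    )
    return heapq.nlargest(k, scored)
```

Scoring all pairs costs time proportional to |R| x |S|, so this sketch is only practical for small inputs; it is meant to make the definition concrete, not to be efficient.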
INDEX TERMS
Internet, Mobile handsets, Global Positioning System, Tuning, Upper bound, Computer science, Australia
CITATION

H. Hu et al., "Top-k spatio-textual similarity join," 2016 IEEE 32nd International Conference on Data Engineering (ICDE), Helsinki, Finland, 2016, pp. 1576-1577.
doi:10.1109/ICDE.2016.7498433