2016 IEEE 32nd International Conference on Data Engineering (ICDE) (2016)
Helsinki, Finland
May 16, 2016 to May 20, 2016
ISBN: 978-1-5090-2020-1
pp: 1406-1409
Ahmed S. Abdelhamid, Purdue University, West Lafayette, IN, USA
Mingjie Tang, Purdue University, West Lafayette, IN, USA
Ahmed M. Aly, Purdue University, West Lafayette, IN, USA
Ahmed R. Mahmood, Purdue University, West Lafayette, IN, USA
Thamir Qadah, Purdue University, West Lafayette, IN, USA
Walid G. Aref, Purdue University, West Lafayette, IN, USA
Saleh Basalamah, Umm Al-Qura University, Makkah, KSA
ABSTRACT
Advances in location-based services (LBS) demand high-throughput processing of both static and streaming data. Recently, many systems have been introduced to support distributed main-memory processing to maximize the query throughput. However, these systems are not optimized for spatial data processing. In this demonstration, we showcase Cruncher, a distributed main-memory spatial data warehouse and streaming system. Cruncher extends Spark with adaptive query processing techniques for spatial data. Cruncher uses dynamic batch processing to distribute the queries and the data streams over commodity hardware according to an adaptive partitioning scheme. The batching technique also groups and orders the overlapping spatial queries to enable inter-query optimization. Both the data streams and the offline data share the same partitioning strategy that allows for data co-locality optimization. Furthermore, Cruncher uses an adaptive caching strategy to maintain the frequently-used location data in main memory. Cruncher maintains operational statistics to optimize query processing, data partitioning, and caching at runtime. We demonstrate two LBS applications over Cruncher using real datasets from OpenStreetMap and two synthetic data streams. We demonstrate that Cruncher achieves order(s) of magnitude throughput improvement over Spark when processing spatial data.
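As a rough illustration of the batching idea mentioned in the abstract (grouping overlapping spatial queries so that they can be optimized together), the following minimal Python sketch clusters rectangular range queries whose bounding boxes intersect. It is not Cruncher's implementation: the rectangle representation, the function names `overlaps` and `group_overlapping`, and the use of a union-find structure are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical query representation: (min_x, min_y, max_x, max_y).
Rect = Tuple[float, float, float, float]


def overlaps(a: Rect, b: Rect) -> bool:
    """Return True if two axis-aligned rectangles intersect or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def group_overlapping(queries: List[Rect]) -> List[List[int]]:
    """Group query indices into connected clusters of overlapping rectangles.

    Uses a simple union-find over all pairs (quadratic in the batch size),
    which is adequate for a small batch but is only a sketch.
    """
    parent = list(range(len(queries)))

    def find(i: int) -> int:
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(queries)):
        for j in range(i + 1, len(queries)):
            if overlaps(queries[i], queries[j]):
                ri, rj = find(i), find(j)
                if ri != rj:
                    # Keep the smaller index as the root of the merged group.
                    parent[max(ri, rj)] = min(ri, rj)

    groups: Dict[int, List[int]] = {}
    for i in range(len(queries)):
        groups.setdefault(find(i), []).append(i)
    # Order groups by the index of their first query.
    return sorted(groups.values())


if __name__ == "__main__":
    batch = [(0, 0, 2, 2), (1, 1, 3, 3), (10, 10, 11, 11), (2.5, 2.5, 4, 4)]
    print(group_overlapping(batch))
```

Each resulting group could then be evaluated in a single pass over the partitions it touches. How Cruncher actually orders and shares work among grouped queries is not specified in the abstract.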
INDEX TERMS
Spatial databases, Query processing, Sparks, Optimization, Distributed databases, Throughput, Indexes
CITATION
Ahmed S. Abdelhamid, Mingjie Tang, Ahmed M. Aly, Ahmed R. Mahmood, Thamir Qadah, Walid G. Aref, Saleh Basalamah, "Cruncher: Distributed in-memory processing for location-based services", 2016 IEEE 32nd International Conference on Data Engineering (ICDE), pp. 1406-1409, 2016, doi:10.1109/ICDE.2016.7498356