Scalable and efficient spatial data management on multi-core CPU and GPU clusters: A preliminary implementation based on Impala
2015 31st IEEE International Conference on Data Engineering Workshops (ICDEW) (2015)
Seoul, South Korea
April 13, 2015 to April 17, 2015
Simin You, Dept. of Computer Science, CUNY Graduate Center, New York, USA
Jianting Zhang, Department of Computer Science, The City College of New York, USA
Le Gruenwald, Dept. of Computer Science, The University of Oklahoma, Norman, USA
Rapidly increasing volumes of spatial data have made it imperative to develop scalable and efficient spatial data management techniques that leverage modern parallel hardware and distributed systems. By integrating Impala, a leading open source Big Data system, with our previous work on data parallel designs for spatial indexing and query processing, we have developed ISP-MC+ and ISP-GPU for large-scale spatial data management on computer clusters equipped with multi-core CPUs and Graphics Processing Units (GPUs), respectively. Both ISP-MC+ and ISP-GPU have shown high efficiency and good scalability on a 10-node Amazon EC2 cluster equipped with multi-core CPUs and GPUs. Comparison with a baseline implementation using traditional techniques on a single CPU core has demonstrated orders-of-magnitude speedups on a real-world dataset with hundreds of millions of point locations.
Graphics processing units, Spatial databases, Query processing, Indexing, Scalability, Big data, Geometry
S. You, J. Zhang and L. Gruenwald, "Scalable and efficient spatial data management on multi-core CPU and GPU clusters: A preliminary implementation based on Impala," 2015 31st IEEE International Conference on Data Engineering Workshops (ICDEW), Seoul, South Korea, 2015, pp. 143-148.