2017 IEEE 33rd International Conference on Data Engineering (ICDE 2017)
San Diego, California, USA
April 19, 2017 to April 22, 2017
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDE.2017.236
Processing large-scale data is typically memory intensive. The current generation of Graphics Processing Units (GPUs) has much lower memory capacity than CPUs, which is often a limiting factor in processing large-scale data on GPUs. It is therefore desirable to reduce the memory footprint of spatially joining large-scale datasets through query optimization. In this study, we present a parallel selectivity estimation technique for optimizing spatial join processing on GPUs. By integrating the multi-dimensional cumulative histogram structure with the summed-area-table algorithm, our data-parallel selectivity estimation technique can be realized efficiently on GPUs. Experiments on spatially joining two sets of Minimum Bounding Boxes (MBBs) derived from real point and polygon data, each with about one million MBBs, show that selectivity estimation at four grid levels takes less than 1/3 of a second on an Nvidia GTX Titan GPU. By using the best grid resolution, our technique reduces the memory footprint of the spatial join by 38.4%.
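To illustrate the core idea behind the abstract, the sketch below builds a grid histogram, turns it into a cumulative histogram (summed-area table), and answers rectangular-region counts in O(1) via inclusion-exclusion; such counts are the building block of grid-based spatial-join selectivity estimates. This is a minimal, sequential Python sketch of the general technique, not the authors' GPU implementation; the grid size, function names, and unit-square domain are illustrative assumptions (on a GPU, the two prefix-sum passes would be data-parallel scans).

```python
# Hedged sketch of summed-area-table (SAT) based selectivity estimation.
# All names and parameters here are illustrative, not from the paper.

def build_histogram(points, nx, ny):
    """Count points per cell of an nx-by-ny uniform grid over the unit square."""
    h = [[0] * nx for _ in range(ny)]
    for x, y in points:
        cx = min(int(x * nx), nx - 1)   # clamp x == 1.0 into the last column
        cy = min(int(y * ny), ny - 1)   # clamp y == 1.0 into the last row
        h[cy][cx] += 1
    return h

def summed_area_table(h):
    """Cumulative histogram: sat[r][c] = sum of h over cells (0..r, 0..c).
    Each pass is a prefix sum along one axis; on a GPU these become
    parallel scans over rows and then columns."""
    ny, nx = len(h), len(h[0])
    sat = [[0] * nx for _ in range(ny)]
    for r in range(ny):
        run = 0
        for c in range(nx):
            run += h[r][c]                                  # row prefix sum
            sat[r][c] = run + (sat[r - 1][c] if r else 0)   # add column prefix
    return sat

def region_count(sat, r0, c0, r1, c1):
    """O(1) count of points in cell range [r0..r1] x [c0..c1]
    via inclusion-exclusion on the summed-area table."""
    total = sat[r1][c1]
    if r0:
        total -= sat[r0 - 1][c1]
    if c0:
        total -= sat[r1][c0 - 1]
    if r0 and c0:
        total += sat[r0 - 1][c0 - 1]
    return total
```

A selectivity estimate for a query MBB then follows by mapping its corners to grid cells and calling `region_count`; repeating this at several grid resolutions mirrors the multi-level estimation described in the abstract.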
Spatial databases, Estimation, Histograms, Graphics processing units, Query processing, Hardware, Parallel processing
J. Zhang, S. You and L. Gruenwald, "Parallel Selectivity Estimation for Optimizing Multidimensional Spatial Join Processing on GPUs," 2017 IEEE 33rd International Conference on Data Engineering (ICDE), San Diego, California, USA, 2017, pp. 1591-1598.