1999 International Symposium on Database Applications in Non-Traditional Environments (DANTE'99)
Multilevel Data Clustering for Spatial Join Processing
Kyoto, Japan
November 28-30, 1999
ISBN: 0-7695-0496-5
The I/O cost of spatial join processing can be very high because spatial objects are large and many of them are involved. Spatial joins are usually performed with the filter-and-refinement approach. Although a variety of algorithms exist for the filter step on large spatial data sets, little research has been done on improving the performance of the refinement step. By clustering the output of the filter step, we reduce the total number of times that spatial objects are repeatedly loaded during the refinement step, and thereby reduce its I/O cost.
In this paper, a multilevel data partitioning approach is proposed to partition objects into clusters for spatial join processing. Whenever the number of objects exceeds a threshold, say a hundred, the objects are clustered through a multilevel scheme: first coarsening, then partitioning, and finally uncoarsening back to the original object sets, which can be further partitioned using known partitioning methods. Experiments show that our method saves 20-35% of the I/O cost compared with cases where no clustering or only a little clustering is done.
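To illustrate why the order in which filter-step output is refined affects I/O, the following minimal Python sketch counts object loads under a small LRU buffer. Everything in it is an assumption made for illustration: the names `count_loads` and `cluster_order`, the LRU buffer model, the toy candidate pairs, and in particular the clustering itself, which is a simple connected-components grouping used as a stand-in and not the multilevel coarsening, partitioning, and uncoarsening scheme the paper proposes.

```python
from collections import OrderedDict


def count_loads(pairs, buffer_size):
    """Count object loads when candidate pairs are refined in the given
    order, assuming an LRU buffer that holds `buffer_size` objects."""
    buffer = OrderedDict()
    loads = 0
    for pair in pairs:
        for obj in pair:
            if obj in buffer:
                buffer.move_to_end(obj)  # hit: mark as most recently used
            else:
                loads += 1  # miss: the object must be read from disk
                buffer[obj] = True
                if len(buffer) > buffer_size:
                    buffer.popitem(last=False)  # evict least recently used
    return loads


def cluster_order(pairs):
    """Reorder candidate pairs so that pairs in the same connected component
    of the candidate graph are refined together (hypothetical stand-in for
    the paper's multilevel clustering)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for r, s in pairs:
        parent[find(r)] = find(s)  # union the two endpoints

    groups = {}
    for pair in pairs:
        groups.setdefault(find(pair[0]), []).append(pair)
    return [pair for group in groups.values() for pair in group]


# Toy filter-step output: two independent groups of candidate pairs,
# interleaved as an unclustered filter step might emit them.
pairs = [
    ("r1", "s1"), ("r3", "s3"), ("r1", "s2"), ("r3", "s4"),
    ("r2", "s1"), ("r4", "s3"), ("r2", "s2"), ("r4", "s4"),
]

unclustered = count_loads(pairs, buffer_size=4)
clustered = count_loads(cluster_order(pairs), buffer_size=4)
print("loads without clustering:", unclustered)
print("loads with clustering:   ", clustered)
```

In this toy input each group touches four objects, which is exactly the assumed buffer capacity, so the clustered order loads every object once, while the interleaved order reloads objects that were evicted between uses. The numbers here say nothing about the 20-35% savings reported in the paper, which come from the authors' own experiments.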
Index Terms:
Clustering, Optimization, Partition, Spatial join processing
Citation:
Jitian Xiao, Yanchun Zhang, Xiaohua Jia, "Multilevel Data Clustering for Spatial Join Processing," 1999 International Symposium on Database Applications in Non-Traditional Environments (DANTE'99), p. 218, 1999