Issue No. 02 - February (2012 vol. 24)
ISSN: 1041-4347
pp: 295-308
Yansheng Lu, Huazhong University of Science and Technology, Wuhan
Xiaofang Zhou, The University of Queensland, Brisbane
Gabriel Pui Cheong Fung, Arizona State University, Tempe
Hu Xu, Huazhong University of Science and Technology, Wuhan
Shazia Sadiq, The University of Queensland, Brisbane
Ke Deng, The University of Queensland, Brisbane
ABSTRACT
Given a data point set $D$, a query point set $Q$, and an integer $k$, the Group Nearest Group (GNG) query finds a subset $\omega$ ($|\omega| \le k$) of points from $D$ such that the total distance from all points in $Q$ to their nearest point in $\omega$ is no greater than that of any other subset $\omega'$ ($|\omega'| \le k$) of points in $D$. The GNG query is a partition-based clustering problem that arises in many real applications and is NP-hard. In this paper, the Exhaustive Hierarchical Combination (EHC) algorithm and the Subset Hierarchical Refinement (SHR) algorithm are developed for GNG query processing. While EHC is capable of providing the optimal solution for $k=2$, SHR is an efficient approximate approach that combines database techniques with a local search heuristic. Our approaches focus on minimizing the access and evaluation of subsets of cardinality $k$ in $D$, since the number of such subsets is exponentially greater than $|D|$. To do so, hierarchical blocks of data points at a high level are used to find an intermediate solution, which is then refined by following a guided search direction at a low level so as to prune irrelevant subsets. Comprehensive experiments on both real and synthetic data sets demonstrate the superiority of SHR in terms of efficiency and quality.
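To make the query definition concrete, the following is a minimal brute-force sketch of the GNG objective, not the EHC or SHR algorithms from the paper. It assumes Euclidean distance and points given as coordinate tuples; the function names `gng_cost` and `gng_brute_force` are hypothetical and chosen only for illustration. It enumerates every subset of D with at most k points, which is exactly the exponential work the paper's algorithms aim to avoid.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def gng_cost(omega, Q):
    """Total distance from each query point to its nearest point in omega."""
    return sum(min(dist(q, p) for p in omega) for q in Q)


def gng_brute_force(D, Q, k):
    """Enumerate all subsets of D with 1..k points and keep the cheapest.

    Illustrative only: the number of candidate subsets grows
    exponentially with k, so this is feasible for tiny inputs only.
    """
    best_subset, best_cost = None, float("inf")
    for size in range(1, min(k, len(D)) + 1):
        for omega in combinations(D, size):
            cost = gng_cost(omega, Q)
            if cost < best_cost:
                best_subset, best_cost = omega, cost
    return best_subset, best_cost


# Toy data (assumed for illustration, not from the paper).
D = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)]
Q = [(0.0, 1.0), (10.0, 1.0)]
subset, cost = gng_brute_force(D, Q, k=2)
print(subset, cost)
```

On this toy input the two outer data points are selected, since each query point then lies at distance 1 from its nearest selected point.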
INDEX TERMS
K-median clustering, group nearest group query, group nearest neighbor query.
CITATION
Yansheng Lu, Xiaofang Zhou, Gabriel Pui Cheong Fung, Hu Xu, Shazia Sadiq, Ke Deng, "On Group Nearest Group Query Processing", IEEE Transactions on Knowledge & Data Engineering, vol. 24, no. 2, pp. 295-308, February 2012, doi:10.1109/TKDE.2010.230