2010 IEEE 26th International Conference on Data Engineering (ICDE 2010)
Long Beach, CA, USA
Mar. 1, 2010 to Mar. 6, 2010
Wenjie Zhang, The University of New South Wales, Australia
Xuemin Lin, The University of New South Wales, Australia
Muhammad Aamir Cheema, The University of New South Wales, Australia
Ying Zhang, The University of New South Wales, Australia
Wei Wang, The University of New South Wales, Australia
K nearest neighbor (KNN) search has many applications, including data mining, multimedia, image processing, and monitoring moving objects. In this paper, we study the problem of KNN over multi-valued objects. We aim to provide effective and efficient techniques to identify the K nearest neighbors in a way that is sensitive to the relative distributions of objects. We propose to use quantiles to summarize relative-distribution-sensitive K nearest neighbors. Given a query Q and a quantile φ ∈ (0, 1], we first study the problem of efficiently computing the K nearest objects based on a φ-quantile distance (e.g., the median distance) from each object to Q. The second problem is to retrieve the K nearest objects to Q based on overall distances in the “best population” (with a size specified by the φ-quantile) for each object. While the first problem can be solved in polynomial time, we show that the second problem is NP-hard. We propose a set of efficient, novel algorithms that give an exact solution for the first problem and an approximate solution for the second problem with an approximation ratio of 2. Extensive experiments demonstrate that our techniques are efficient and effective.
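As a rough illustration of the first problem, the sketch below ranks multi-valued objects by a φ-quantile distance to a query using brute force. It is not the algorithm proposed in the paper. It assumes that every instance carries equal weight, that distances are Euclidean, and that the φ-quantile distance is the smallest pairwise instance distance d such that at least a φ fraction of all object-query instance pairs lie within d. The names `quantile_distance` and `quantile_knn` are hypothetical.

```python
import math
from itertools import product


def quantile_distance(obj, query, phi):
    """Return the phi-quantile of all pairwise instance distances.

    Assumption: uniform instance weights and Euclidean distance.
    """
    if not 0 < phi <= 1:
        raise ValueError("phi must be in (0, 1]")
    # All distances between an instance of the object and an instance of the query.
    dists = sorted(math.dist(u, q) for u, q in product(obj, query))
    # Smallest distance such that at least a phi fraction of pairs are within it.
    rank = math.ceil(phi * len(dists))
    return dists[rank - 1]


def quantile_knn(objects, query, phi, k):
    """Return the names of the k objects with the smallest phi-quantile distance."""
    ranked = sorted(
        objects,
        key=lambda name: quantile_distance(objects[name], query, phi),
    )
    return ranked[:k]


# Toy data: each object is a list of 2-D instances; the query has one instance.
objects = {
    "A": [(0, 0), (1, 0)],
    "B": [(5, 0), (6, 0)],
    "C": [(0, 0), (10, 0)],
}
query = [(0, 0)]

median_result = quantile_knn(objects, query, phi=0.5, k=2)
max_result = quantile_knn(objects, query, phi=1.0, k=2)
```

On this toy data the choice of φ changes the answer: with φ = 0.5, object C ranks among the two nearest because half of its instances coincide with the query, while with φ = 1.0 its far instance pushes it behind B. This brute-force version computes every pairwise distance, so it only serves to make the definition concrete.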
M. A. Cheema, X. Lin, W. Wang, W. Zhang and Y. Zhang, "Quantile-based KNN over multi-valued objects," 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010), Long Beach, CA, USA, 2010, pp. 16-27.