Scientific and Statistical Database Management, International Conference on (2007)
Banff, Alberta, Canada
July 9, 2007 to July 11, 2007
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/SSDBM.2007.5
Marcos R. Vieira, George Mason University, USA
Caetano Traina Jr., University of Sao Paulo at Sao Carlos, Brazil
Agma J. M. Traina, University of Sao Paulo at Sao Carlos, Brazil
Adriano Arantes, University of Sao Paulo at Sao Carlos, Brazil
Christos Faloutsos, Carnegie Mellon University, USA
This paper proposes novel and effective techniques for estimating a radius with which to answer k-nearest neighbor (k-NN) queries. The first technique targets datasets where the distribution of pairwise distances between elements can be learned, yielding a global estimate that applies to the whole dataset. The second technique targets datasets where the first cannot be employed, yielding estimates that depend on where the query center is located. The proposed k-NNF() algorithm combines both techniques, achieving remarkable speedups. Experiments on both real and synthetic datasets show that the proposed algorithm can accelerate k-NN queries by more than 26 times compared with the incremental algorithm, and takes half the total time of the traditional k-NN() algorithms.
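The general idea of answering a k-NN query through an estimated radius can be illustrated with a minimal sketch. It is not the paper's k-NNF() algorithm: the abstract does not specify the estimators, so the sketch assumes a simple stand-in for the global estimate (an empirical quantile of sampled pairwise distances), uses a linear scan in place of an indexed range query, and enlarges the radius when too few elements are found. All function names and parameters (`estimate_global_radius`, `sample_size`, `growth`) are hypothetical.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def estimate_global_radius(data, k, sample_size=200, seed=0):
    """Assumed stand-in for a global estimate: the pairwise-distance
    quantile at which an average element has about k neighbors."""
    if len(data) < 2:
        return 0.0
    rng = random.Random(seed)
    sample = data if len(data) <= sample_size else rng.sample(data, sample_size)
    dists = sorted(
        euclidean(a, b) for i, a in enumerate(sample) for b in sample[i + 1:]
    )
    # An element has about (N - 1) * F(r) neighbors within r, where F is the
    # distribution of pairwise distances, so we look for F(r) = k / (N - 1).
    frac = min(1.0, k / (len(data) - 1))
    idx = min(len(dists) - 1, math.ceil(frac * len(dists)))
    return dists[idx]


def range_query(data, center, radius):
    """Linear-scan range query; an index structure would replace this."""
    hits = []
    for p in data:
        d = euclidean(center, p)
        if d <= radius:
            hits.append((d, p))
    return hits


def knn_with_estimated_radius(data, center, k, growth=2.0):
    """Answer a k-NN query with a range query of estimated radius,
    enlarging the radius (an assumed fallback) if fewer than k are found."""
    radius = estimate_global_radius(data, k)
    while True:
        hits = range_query(data, center, radius)
        if len(hits) >= k or len(hits) == len(data):
            return sorted(hits)[:k]
        radius = radius * growth if radius > 0 else 1e-9
```

The result is exact regardless of how good the estimate is, because the loop ends only once the range query holds at least k elements (or the whole dataset). A good estimate simply keeps the first range query small and avoids repeated scans.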
C. Traina Jr., C. Faloutsos, M. R. Vieira, A. J. Traina and A. Arantes, "Boosting k-Nearest Neighbor Queries Estimating Suitable Query Radii," 2007 International Conference on Scientific and Statistical Database Management (SSDBM), Banff, Alta., 2007, pp. 10.