2016 IEEE 32nd International Conference on Data Engineering (ICDE) (2016)
May 16, 2016 to May 20, 2016
Antoine Boutet, University of Lyon, LIRIS, CNRS, INSA-Lyon, UMR5205, F-69621, France
Anne-Marie Kermarrec, INRIA, Rennes, France
Nupur Mittal, INRIA, Rennes, France
Francois Taiani, University of Rennes 1, France
K-Nearest-Neighbor (KNN) graphs have emerged as a fundamental building block of many on-line services providing recommendation, similarity search and classification. Constructing a KNN graph rapidly and accurately is, however, a computationally intensive task. As data volumes keep growing, speed and the ability to scale out are becoming critical factors when deploying a KNN algorithm. In this work, we present KIFF, a generic, fast and scalable KNN graph construction algorithm. KIFF directly exploits the bipartite nature of most datasets to which KNN algorithms are applied. This simple but powerful strategy drastically limits the computational cost required to rapidly converge to an accurate KNN solution, especially for sparse datasets. Our evaluation on a representative range of datasets shows that KIFF provides, on average, a speed-up factor of 14 over recent state-of-the-art solutions while improving the quality of the KNN approximation by 18%.
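The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of exploiting a bipartite user-item structure, not KIFF itself. It assumes each user has a set of items, that Jaccard similarity is the metric, and that the hypothetical function `knn_via_shared_items` is an acceptable stand-in: similarities are computed only between users that share at least one item, which on sparse data is far fewer pairs than an all-pairs comparison.

```python
from collections import defaultdict


def knn_via_shared_items(profiles, k):
    """Illustrative only (not the paper's algorithm).

    profiles: dict mapping user -> set of items (the bipartite graph).
    Returns a dict mapping user -> list of (similarity, neighbor), best first.
    """
    # Invert the bipartite graph: item -> users connected to it.
    item_users = defaultdict(set)
    for user, items in profiles.items():
        for item in items:
            item_users[item].add(user)

    knn = {}
    for user, items in profiles.items():
        # Count shared items per candidate; users with no item in
        # common never appear here and are never compared.
        shared = defaultdict(int)
        for item in items:
            for other in item_users[item]:
                if other != user:
                    shared[other] += 1

        # Jaccard similarity from the overlap counts (assumed metric).
        scored = []
        for other, common in shared.items():
            union = len(items) + len(profiles[other]) - common
            scored.append((common / union, other))

        # Keep the k most similar; ties are broken by neighbor id.
        knn[user] = sorted(scored, key=lambda t: (-t[0], t[1]))[:k]
    return knn
```

With this sketch, a user whose items are shared with nobody ends up with an empty neighbor list, since no candidate is ever generated for it.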
Measurement, IP networks, Bipartite graph, Art, Motion pictures, Artificial neural networks, Search problems
A. Boutet, A. Kermarrec, N. Mittal and F. Taiani, "Being prepared in a sparse world: The case of KNN graph construction," 2016 IEEE 32nd International Conference on Data Engineering (ICDE), Helsinki, Finland, 2016, pp. 241-252.