2017 IEEE 33rd International Conference on Data Engineering (2017)
San Diego, California, USA
April 19, 2017 to April 22, 2017
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDE.2017.47
Locality sensitive hashing (LSH) and its variants are widely used for approximate kNN (k nearest neighbor) search in high-dimensional space. The success of these techniques largely depends on the ability of preserving kNN information. Unfortunately, LSH only provides a high probability that nearby points in the original space are projected into nearby region in a new space. This potentially makes many false positives and false negatives resulting from unrelated points. Many extensions of LSH aim to alleviate the above issue by improving the distance preserving ability. In this paper, we abound improving LSH function but propose a novel idea to enhance the performance by transforming the original data to a new space before applying LSH. A preserving-ignoring transformation (PIT) function satisfying some rigorous conditions can be used to convert original points to an interim space with strict distance preserving-ignoring capacity. Based on this property, a linear order is utilized to build an efficient index structure in the interim space. Finally, LSH can be applied to candidate set searched by our index structure for final results. Experiments are conducted and the proposed approach performs better than state-of-the-art methods SK-LSH, DSH and NSH in terms of both accuracy and efficiency.
Compounds, Indexes, Visualization, Buildings, Transforms, Measurement
G. Hu, J. Shao, D. Zhang, Y. Yang and H. T. Shen, "Preserving-Ignoring Transformation Based Index for Approximate k Nearest Neighbor Search," 2017 IEEE 33rd International Conference on Data Engineering(ICDE), San Diego, California, USA, 2017, pp. 91-94.