Similarity Search and Applications, International Workshop on (2009)
Prague, Czech Republic
Aug. 29, 2009 to Aug. 30, 2009
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/SISAP.2009.12
A recent probabilistic approach for searching in high dimensional metric spaces is based on predicting the distances between database elements according to how they order their distances towards some set of distinguished elements, called permutants. In the preprocessing phase a set of permutants is chosen, and for every database element the permutants are sorted (permuted) by their distance to that element. The resulting permutations form the index. When a query is given, its corresponding permutation is computed and, since similar elements will (probably) have similar permutations, the database elements are compared against the query in the order induced by the similarity between their permutations and the query's permutation. This works well but has relatively high CPU time due to computing the distances between permutations and (partially) sorting the database by the similarity. We improve this by identifying and solving this as another metric space problem. This avoids many distance computations between the permutants. The experimental results show that this works extremely well in practice.
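The baseline scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `fraction` parameter, the Euclidean distance, and the use of the Spearman footrule as the similarity between permutations are all assumptions made for illustration (the abstract does not name a specific permutation distance). The sketch covers only the baseline; the full sort over the database in `search` is the CPU cost the paper sets out to reduce.

```python
import math

def euclidean(a, b):
    # Assumed metric for the example; any metric distance function would do.
    return math.dist(a, b)

def permutation(obj, permutants, dist):
    """Permutant indices sorted by increasing distance to obj (ties by index)."""
    return sorted(range(len(permutants)),
                  key=lambda i: (dist(obj, permutants[i]), i))

def spearman_footrule(p, q):
    """Sum of absolute differences between each permutant's rank in p and in q."""
    rank_in_q = {perm: rank for rank, perm in enumerate(q)}
    return sum(abs(rank - rank_in_q[perm]) for rank, perm in enumerate(p))

def build_index(db, permutants, dist):
    """Preprocessing: one permutation per database element."""
    return [permutation(x, permutants, dist) for x in db]

def search(query, db, index, permutants, dist, k=1, fraction=0.2):
    """Approximate k-NN: compare the query only against the most promising
    fraction of the database, in order of permutation similarity."""
    q_perm = permutation(query, permutants, dist)
    # Baseline step: score and sort every stored permutation.
    order = sorted(range(len(db)),
                   key=lambda i: spearman_footrule(q_perm, index[i]))
    budget = max(k, int(fraction * len(db)))
    candidates = order[:budget]
    # Real distance computations happen only for the selected candidates.
    candidates.sort(key=lambda i: dist(query, db[i]))
    return candidates[:k]

# Hypothetical toy data: three permutants and four database points in the plane.
permutants = [(0, 0), (10, 0), (0, 10)]
db = [(1, 0), (9, 1), (0, 9), (5, 5)]
index = build_index(db, permutants, euclidean)
nearest = search((8, 0), db, index, permutants, euclidean, k=1, fraction=0.25)
```

With `fraction=0.25` only one of the four elements is compared against the query with the real distance function; raising `fraction` to 1.0 turns the search into an exact scan in permutation order.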
metric space indexing, probabilistic algorithms, indexing permutations
Kimmo Fredriksson, Karina Figueroa, "Speeding Up Permutation Based Indexing with Indexing", Similarity Search and Applications, International Workshop on, pp. 107-114, 2009, doi:10.1109/SISAP.2009.12