Issue No. 09 - September (2008 vol. 30)
We introduce a new probabilistic proximity search algorithm for range and k-nearest neighbor (k-NN) searching in both coordinate and metric spaces. Although there exist solutions for these problems, they boil down to a linear scan when the space is intrinsically high dimensional, as is the case in many pattern recognition tasks. This, for example, renders the k-NN approach to classification rather slow in large databases. Our novel idea is to predict closeness between elements according to how they order their distances toward a distinguished set of anchor objects. Each element in the space sorts the anchor objects from closest to farthest to it and the similarity between orders turns out to be an excellent predictor of the closeness between the corresponding elements. We present extensive experiments comparing our method against state-of-the-art exact and approximate techniques, both in synthetic and real, metric and nonmetric databases, measuring both CPU time and distance computations. The experiments demonstrate that our technique almost always improves upon the performance of alternative techniques, in some cases by a wide margin.
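The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Several things here are assumptions made only for illustration: the Euclidean distance, the Spearman footrule used to compare anchor orderings, the `fraction` parameter that bounds how many candidates are checked with real distance computations, and all function names.

```python
import math


def euclidean(a, b):
    """Example distance; any metric (or dissimilarity) could be plugged in."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def anchor_ranks(obj, anchors, dist):
    """Return ranks where ranks[i] is the position of anchor i when the
    anchors are sorted from closest to farthest from obj."""
    order = sorted(range(len(anchors)), key=lambda i: (dist(obj, anchors[i]), i))
    ranks = [0] * len(anchors)
    for position, anchor_index in enumerate(order):
        ranks[anchor_index] = position
    return ranks


def footrule(r1, r2):
    """Spearman footrule: sum of rank displacements (assumed comparison)."""
    return sum(abs(a - b) for a, b in zip(r1, r2))


def build_index(database, anchors, dist):
    """Precompute the anchor ordering of every database element."""
    return [anchor_ranks(obj, anchors, dist) for obj in database]


def knn_search(query, k, database, anchors, index, dist, fraction=0.2):
    """Approximate k-NN: visit elements in order of how similar their anchor
    ordering is to the query's, and compute real distances only for the
    first `fraction` of the database (at least k elements)."""
    q_ranks = anchor_ranks(query, anchors, dist)
    candidates = sorted(
        range(len(database)), key=lambda i: (footrule(q_ranks, index[i]), i)
    )
    budget = max(k, int(fraction * len(database)))
    scored = sorted((dist(query, database[i]), i) for i in candidates[:budget])
    return [i for _, i in scored[:k]]
```

With `fraction=1.0` the sketch degenerates to an exact linear scan; smaller values trade accuracy for fewer distance computations, which is the probabilistic trade-off the abstract refers to. How the anchors are chosen is left open here.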
Extraterrestrial measurements, Pattern recognition, Databases, Computer Society, Feature extraction, Information retrieval, Support vector machines, Support vector machine classification, Neural networks, Sequences, Implementation, Data Structures, Data Storage Representations, Indexing methods, Information Storage and Retrieval, Information Search and Retrieval
"Effective Proximity Retrieval by Ordering Permutations," in IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 30, no. , pp. 1, 2008.