loading...
 This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
21st International Conference on Data Engineering Workshops (ICDEW'05)
High-Dimensional Similarity Searches Using A Metric Pseudo-Grid
Tokyo, Japan
April 05-April 08
ISBN: 0-7695-2657-8
Christian Digout, Univ. of Alberta, Canada
Mario A. Nascimento, Univ. of Alberta, Canada
Despite the proposal of numerous tree-based access structures for high dimensional similarity searches, techniques based on a sequential scan have been shown to be simple yet quite efficient alternatives. Given that random accesses to disk are expensive, a linear scan of the (smaller) pre-processed dataset is often much more efficient than even a relatively small number of random disk accesses yielded by tree-based indices. In this paper we present a technique which uses a pseudo-partition of a general metric space analog to the VA-file?s partition of the vector space. The rationale is to use a number of pivot objects in the metric space, each one determining a number of hyper-rings in this space. The intersection of those rings, determine pseudo-cells analog to the VA-file cells in the vector space. In order to speedup query processing the data set is clustered (using any applicable clustering technique). Clusters not intersecting cells intersected by the query region cannot contribute to the answer set. Thus, only a few clusters are searched using an I/O efficient linear scan of the cluster?s data. The proposed technique, which we call the M-GRID, is, by construction, applicable to both general metric spaces and to traditional vector spaces as long as a metric distance function is used. The M-GRID is robust to several parameters and experiments with synthetic and real data sets show that it is able to perform nearest neighbor queries up to 10 times faster than the VA-File.
Citation:
Christian Digout, Mario A. Nascimento, "High-Dimensional Similarity Searches Using A Metric Pseudo-Grid," icdew, pp.1174, 21st International Conference on Data Engineering Workshops (ICDEW'05), 2005
Usage of this product signifies your acceptance of the Terms of Use.