Proceedings 17th International Conference on Data Engineering (2001)
Apr. 2, 2001 to Apr. 6, 2001
Roberto Figueira Santos Filho, University of São Paulo at São Carlos-Brazil
Agma Traina, University of São Paulo at São Carlos-Brazil
Caetano Traina Jr., University of São Paulo at São Carlos-Brazil
Christos Faloutsos, Carnegie Mellon University
Abstract: Designing a new access method inside a commercial DBMS is cumbersome and expensive. We propose a family of metric access methods that are fast and easy to implement on top of existing access methods, such as sequential scan, R-trees and Slim-trees. The idea is to elect a set of objects as foci and to gauge all other objects by their distances from this set. We show how to define the cardinality of the foci set, how to choose appropriate foci, and how to perform range and nearest-neighbor queries using them without false dismissals. The foci increase the pruning of distance calculations during query processing. Furthermore, we index the distances from each object to the foci to reduce even the triangle inequality comparisons. Experiments on real and synthetic datasets show that our methods match or outperform existing methods. They are up to 10 times faster and perform up to 10 times fewer distance calculations and disk accesses. In addition, they scale up well, exhibiting sub-linear performance with growing database size.
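The foci idea in the abstract can be illustrated with a minimal sketch over a sequential scan. It is not the authors' implementation: the function names (`choose_foci`, `omni_coordinates`, `range_query`), the greedy farthest-first foci selection, and the Euclidean metric are all assumptions made for illustration. The sketch precomputes each object's distances to the foci and uses the triangle inequality to skip objects that cannot lie within the query radius, so no qualifying object is dismissed.

```python
import math
import random


def euclidean(a, b):
    # Example metric; any distance obeying the triangle inequality would do.
    return math.dist(a, b)


def choose_foci(objects, k, dist):
    # Assumed heuristic (greedy farthest-first), not necessarily the paper's rule:
    # start far from an arbitrary object, then keep adding the object whose
    # minimum distance to the foci chosen so far is largest.
    foci = [max(objects, key=lambda o: dist(objects[0], o))]
    while len(foci) < k:
        foci.append(max(objects, key=lambda o: min(dist(o, f) for f in foci)))
    return foci


def omni_coordinates(objects, foci, dist):
    # Precompute, once, the distance from every object to every focus.
    return [[dist(o, f) for f in foci] for o in objects]


def range_query(query, radius, objects, coords, foci, dist):
    # Distances from the query to the foci: one calculation per focus.
    q_coords = [dist(query, f) for f in foci]
    results, computed = [], 0
    for obj, obj_coords in zip(objects, coords):
        # Triangle inequality: |d(q, f) - d(o, f)| <= d(q, o) for every focus f.
        # If any focus gives a lower bound above the radius, o cannot qualify.
        if any(abs(qc - oc) > radius for qc, oc in zip(q_coords, obj_coords)):
            continue
        computed += 1  # only surviving candidates cost a real distance calculation
        if dist(query, obj) <= radius:
            results.append(obj)
    return results, computed


if __name__ == "__main__":
    random.seed(1)
    points = [(random.random(), random.random()) for _ in range(1000)]
    foci = choose_foci(points, 3, euclidean)
    coords = omni_coordinates(points, foci, euclidean)
    hits, computed = range_query((0.5, 0.5), 0.05, points, coords, foci, euclidean)
    print(len(hits), "results using", computed, "distance calculations")
```

In this sketch the pruning test only compares precomputed numbers, so the expensive distance function runs solely on the candidates that survive. Indexing the precomputed distances, as the abstract describes, would reduce those comparisons as well.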
R. F. Filho, A. Traina, C. Traina Jr. and C. Faloutsos, "Similarity Search without Tears: The OMNI-Family of All-Purpose Access Methods," Proceedings 17th International Conference on Data Engineering (ICDE), Heidelberg, Germany, 2001, pp. 0623.