C. Traina, Jr., A. Traina, C. Faloutsos, B. Seeger, "Fast Indexing and Visualization of Metric Data Sets using Slim-Trees," IEEE Transactions on Knowledge and Data Engineering, vol. 14, no. 2, pp. 244-260, March/April 2002.  
Many recent database applications must deal with similarity queries. For such applications, it is important to measure the similarity between two objects using the distance between them. Focusing on this problem, this paper proposes the Slim-tree, a new dynamic tree for organizing metric data sets in pages of fixed size. The Slim-tree uses the triangle inequality to prune distance calculations needed to answer similarity queries over objects in metric spaces. The proposed insertion algorithm uses new policies to select the nodes where incoming objects are stored. When a node overflows, the Slim-tree uses a Minimal Spanning Tree to help with the split. The new insertion algorithm leads to a tree with high storage utilization and improved query performance. The Slim-tree is the first metric access method to tackle the problem of overlap between nodes in metric spaces and to propose a technique to minimize it. The proposed “fat-factor” is a way to quantify whether a given tree can be improved and also to compare two trees. We show how to use the fat-factor to achieve accurate estimates of the search performance and also how to improve the performance of a metric tree through the proposed “Slim-down” algorithm. This paper also presents a new tool in the arsenal of resources of Slim-tree aimed at visualizing it. Visualization is a powerful tool for interactive data mining and for the visual tracking of the behavior of a tree under updates. Finally, we present a formula to estimate the number of disk accesses in range queries. Results from experiments with real and synthetic data sets show that the new algorithms of the Slim-tree lead to performance improvements. These results show that the Slim-tree outperforms the M-tree up to 200 percent for range queries. For insertion and split, the Minimal-Spanning-Tree-based algorithm achieves up to 40 times faster insertions. We observed improvements up to 40 percent in range queries after applying the Slim-down algorithm.
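The triangle-inequality pruning mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm or data structure: it assumes a single flat "node" with one representative object and precomputed distances from that representative to each stored object, and the names `range_query`, `rep`, and `dists_to_rep` are hypothetical. For a range query with center q and radius r, any object o with |d(q, rep) - d(rep, o)| > r cannot lie within r of q, so d(q, o) never needs to be computed.

```python
import math


def euclidean(a, b):
    """Euclidean distance; any metric obeying the triangle inequality works."""
    return math.dist(a, b)


def range_query(rep, objects, dists_to_rep, query, radius, dist=euclidean):
    """Return (matches, number of distance computations performed).

    Hypothetical flat-node sketch: dists_to_rep[i] holds the precomputed
    distance d(rep, objects[i]).
    """
    d_q_rep = dist(query, rep)  # one distance to the representative
    computed = 1
    matches = []
    for obj, d_rep_obj in zip(objects, dists_to_rep):
        # Triangle inequality: d(q, obj) >= |d(q, rep) - d(rep, obj)|.
        # If this lower bound already exceeds the radius, skip the object.
        if abs(d_q_rep - d_rep_obj) > radius:
            continue
        computed += 1
        if dist(query, obj) <= radius:
            matches.append(obj)
    return matches, computed


# Usage: four stored objects, two of which are pruned without a distance call.
rep = (0.0, 0.0)
objects = [(1.0, 0.0), (0.0, 2.0), (10.0, 0.0), (0.0, -9.0)]
dists_to_rep = [euclidean(rep, o) for o in objects]
matches, computed = range_query(rep, objects, dists_to_rep, (1.0, 1.0), 1.5)
print(matches, computed)  # [(1.0, 0.0), (0.0, 2.0)] 3
```

In this toy run only three distances are computed at query time instead of five (the representative plus four objects), because the two far-away objects are ruled out by the lower bound alone.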
