Making the Threshold Algorithm Access Cost Aware
October 2004 (vol. 16 no. 10)
pp. 1297-1301
Assume a database storing N objects with d numerical attributes or feature values. All objects in the database can be assigned an overall score that is derived from their single feature values (and the feature values of a user-defined query). The problem considered here is then to efficiently retrieve the k objects with minimum (or maximum) overall score. The well-known threshold algorithm (TA) was proposed as a solution to this problem. TA views the database as a set of d sorted lists storing the feature values. Even though TA is optimal with regard to the number of accesses, its overall access cost can be high since, in practice, some list accesses may be more expensive than others. We therefore propose to make TA access cost aware by choosing the next list to access such that the overall cost is minimized. Our experimental results show that this overall cost is close to the optimal cost and significantly lower than the cost of prior approaches.
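To make the idea concrete, here is a minimal sketch of a threshold-style top-k search that picks the next list by a cost-aware rule. It is not the selection strategy of the paper, which the abstract does not spell out. The function name `cost_aware_top_k`, the sum aggregate, the maximization setting, and the greedy rule (favor the list whose values have dropped fastest per unit of access cost so far) are all assumptions made for illustration. Random accesses are simulated with dictionaries and are not charged.

```python
import heapq


def cost_aware_top_k(lists, costs, k):
    """Illustrative top-k search over d sorted lists (maximization, sum aggregate).

    lists: d lists of (obj_id, value), each sorted by value in descending order;
           every object is assumed to appear in every list.
    costs: assumed cost of one sorted access to each list.
    Returns (top_k, total_sorted_access_cost).
    """
    d = len(lists)
    lookup = [dict(lst) for lst in lists]   # stands in for random access
    pos = [0] * d                           # next unread position per list
    first = [lst[0][1] for lst in lists]    # best value in each list
    last = list(first)                      # upper bound on unseen values per list
    scores = {}                             # overall score of every object seen so far
    total_cost = 0.0

    def rate(i):
        # Hypothetical heuristic: average drop in value per unit cost so far.
        # Lists that were never read get priority so each bound is initialized.
        if pos[i] == 0:
            return float("inf")
        return (first[i] - last[i]) / (pos[i] * costs[i])

    while True:
        candidates = [i for i in range(d) if pos[i] < len(lists[i])]
        if not candidates:
            break                           # every list is exhausted
        i = max(candidates, key=rate)       # cost-aware choice of the next list
        obj, value = lists[i][pos[i]]       # one sorted access
        pos[i] += 1
        total_cost += costs[i]
        last[i] = value
        if obj not in scores:
            scores[obj] = sum(lookup[j][obj] for j in range(d))
        threshold = sum(last)               # no unseen object can score higher
        top = heapq.nlargest(k, scores.items(), key=lambda kv: kv[1])
        if len(top) == k and top[-1][1] >= threshold:
            return top, total_cost
    return heapq.nlargest(k, scores.items(), key=lambda kv: kv[1]), total_cost


# Two feature lists; the second is assumed to be five times as expensive to read.
example_lists = [
    [("a", 0.9), ("b", 0.8), ("c", 0.3), ("d", 0.1)],
    [("b", 0.9), ("a", 0.7), ("d", 0.6), ("c", 0.2)],
]
example_costs = [1.0, 5.0]
```

The stopping test is the usual threshold condition: the search ends once the k-th best seen score is at least the sum of the last values read from each list. Only the order in which lists are read differs from a round-robin schedule, so the result stays correct whatever heuristic is plugged into `rate`.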

Index Terms:
Multifeature query, threshold algorithm, adaptive, cost-awareness.
Citation:
Christian A. Lang, Yuan-Chi Chang, John R. Smith, "Making the Threshold Algorithm Access Cost Aware," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 10, pp. 1297-1301, Oct. 2004, doi:10.1109/TKDE.2004.60