loading...
 This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
18th International Conference on Data Engineering (ICDE'02)
A Sampling-Based Estimator for Top-k Query
San Jose, California
February 26-March 01
ISBN: 0-7695-1531-2
Chung-Min Chen, Telcordia Technologies
Yibei Ling, Telcordia Technologies
Top-k queries arise naturally in many database applications that require searching for records whose attribute values are close to those specified in a query. In this paper, we study the problem of processing a top-k query by translating it into an approximate range query that can be efficiently processed by traditional relational DBMSs. We propose a sampling-based approach, along with various query mapping strategies, to determine a range query that yields high recall with low access cost.Our experiments on real-world datasets show that, given the same memory budgets, our sampling-based estimator outperforms a previous histogram-based method in terms of access cost, while achieving the same level of recall. Furthermore, unlike the histogram-based approach, our sampling-based query mapping scheme scales well for high-dimensional data and is easy to implement with low maintenance cost.
Index Terms:
Top-K query, Sampling, Range query
Citation:
Chung-Min Chen, Yibei Ling, "A Sampling-Based Estimator for Top-k Query," icde, pp.0617, 18th International Conference on Data Engineering (ICDE'02), 2002
Usage of this product signifies your acceptance of the Terms of Use.