2009 10th ACIS International Conference on Software Engineering, Artificial Intelligences, Networking and Parallel/Distributed Computing
A New Method for Estimating the Number of Distinct Values over Data Streams
Catholic University of Daegu, Daegu, Korea
May 27-May 29
ISBN: 978-0-7695-3642-2
Virtually all query optimization methods in data stream management systems (DSMSs) require a means of estimating the number of distinct values of an attribute in a data stream. An accurate estimate of the number of distinct values can be crucial for selecting a good query plan. Because data streams are continuous, real-time, and unbounded, they cannot be stored in full in limited memory, which makes estimating the number of distinct values over data streams a difficult problem. In this paper, drawing on the properties of data streams and an analysis of the BloomFilter, we present a new estimation method based on a circular BloomFilter that uses limited space. Storing the distinct values in the circular BloomFilter effectively addresses the problem that a data stream cannot be stored in limited memory. Theoretical analysis and experimental results indicate that the estimation method is feasible and highly effective.
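As a rough illustration of the general idea of counting distinct values with a Bloom filter in bounded memory, a minimal sketch might look like the following. It is not the method of the paper: the abstract does not describe how the circular BloomFilter is organized, so this sketch assumes a plain, non-circular Bloom filter. The class name `BloomDistinctCounter`, its parameters, and the SHA-256 double-hashing scheme are all hypothetical choices made for the example.

```python
import hashlib


class BloomDistinctCounter:
    """Plain Bloom filter that counts values it has not seen before.

    Illustrative assumption only: a standard Bloom filter, not the
    circular BloomFilter described in the paper.
    """

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)  # fixed-size bit array
        self.estimate = 0                         # running distinct-value estimate

    def _positions(self, value):
        # Derive k bit positions from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(str(value).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, value):
        """Process one stream item; return True if it was counted as new."""
        positions = self._positions(value)
        seen = all(self.bits[p >> 3] & (1 << (p & 7)) for p in positions)
        if not seen:
            for p in positions:
                self.bits[p >> 3] |= 1 << (p & 7)
            self.estimate += 1
        return not seen


counter = BloomDistinctCounter()
for item in [1, 2, 2, 3, 1, 3, 3]:
    counter.add(item)
print(counter.estimate)
```

Because a Bloom filter can report false positives but never false negatives, a counter of this kind can only undercount: a value never seen before may occasionally be mistaken for a repeat, while a repeated value is never counted twice. Memory use stays fixed at `num_bits` regardless of stream length, at the cost of accuracy degrading as the filter fills.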
Index Terms:
BloomFilter, Data Streams, the Number of Distinct Values, circular BloomFilter
Citation:
Longjiang Guo, Yingshu Li, Meirui Ren, Zhongzhao Zhang, "A New Method for Estimating the Number of Distinct Values over Data Streams," snpd, pp.71-76, 2009 10th ACIS International Conference on Software Engineering, Artificial Intelligences, Networking and Parallel/Distributed Computing, 2009