loading...
 This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
21st International Conference on Data Engineering (ICDE'05)
Range-Efficient Computation of F₀ over Massive Data Streams
Tokyo, Japan
April 05-April 08
ISBN: 0-7695-2285-8
A. Pavan, Iowa State University
Srikanta Tirthapura, Iowa State University

Efficient one-pass computation of F₀, the number of distinct elements in a data stream, is a fundamental problem arising in various contexts in databases and networking. We consider the problem of efficiently estimating F₀ of a data stream where each element of the stream is an interval of integers.

We present a randomized algorithm which gives an (\varepsilon ,\delta ) approximation of F₀, with the following time complexity (n is the size of the universe of the items): (1) The amortized processing time per interval is 0(\log \frac{1}{\delta }Log\frac{n}{\varepsilon }). (2) The time to answer a query for F₀ is 0(\log {1 \mathord{\left/ {\vphantom {1 {\delta )}}} \right. \kern-\nulldelimiterspace} {\delta )}}. The workspace used is 0(\frac{1}{{\varepsilon ^2 }}\log \frac{1}{\delta }\log n) bits.

Our algorithm improves upon a previous algorithm by Bar-Yossef, Kumar and Sivakumar [5], which requires 0(\frac{1}{{\varepsilon ^5 }}\log \frac{1}{\delta }\log ^5 n) processing time per item. Our algorithm can be used to compute the max-dominance norm of a stream of multiple signals, and significantly improves upon the current best bounds due to Cormode and Muthukrishnan [11]. This also provides efficient and novel solutions for data aggregation problems in sensor networks studied by Nath and Gibbons [22] and Considine et. al. [8].

Citation:
A. Pavan, Srikanta Tirthapura, "Range-Efficient Computation of F₀ over Massive Data Streams," icde, pp.32-43, 21st International Conference on Data Engineering (ICDE'05), 2005
Usage of this product signifies your acceptance of the Terms of Use.