21st International Conference on Data Engineering (ICDE'05) Range-Efficient Computation of F₀ over Massive Data Streams Tokyo, Japan April 05-April 08 ISBN: 0-7695-2285-8
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDE.2005.118
Efficient one-pass computation of F₀, the number of distinct elements in a data stream, is a fundamental problem arising in various contexts in databases and networking. We consider the problem of efficiently estimating F₀ of a data stream where each element of the stream is an interval of integers. We present a randomized algorithm which gives an (\varepsilon ,\delta ) approximation of F₀, with the following time complexity (n is the size of the universe of the items): (1) The amortized processing time per interval is 0(\log \frac{1}{\delta }Log\frac{n}{\varepsilon }). (2) The time to answer a query for F₀ is 0(\log {1 \mathord{\left/ {\vphantom {1 {\delta )}}} \right. \kern-\nulldelimiterspace} {\delta )}}. The workspace used is 0(\frac{1}{{\varepsilon ^2 }}\log \frac{1}{\delta }\log n) bits. Our algorithm improves upon a previous algorithm by Bar-Yossef, Kumar and Sivakumar [5], which requires 0(\frac{1}{{\varepsilon ^5 }}\log \frac{1}{\delta }\log ^5 n) processing time per item. Our algorithm can be used to compute the max-dominance norm of a stream of multiple signals, and significantly improves upon the current best bounds due to Cormode and Muthukrishnan [11]. This also provides efficient and novel solutions for data aggregation problems in sensor networks studied by Nath and Gibbons [22] and Considine et. al. [8].
Citation:
A. Pavan, Srikanta Tirthapura, "Range-Efficient Computation of F₀ over Massive Data Streams," icde, pp.32-43, 21st International Conference on Data Engineering (ICDE'05), 2005 Usage of this product signifies your acceptance of the Terms of Use. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||