19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)
A Fast Algorithm for Approximate Quantiles in High Speed Data Streams
Banff, Alberta, Canada
July 09-July 11
ISBN: 0-7695-2868-6
We present a fast algorithm for computing approx- imate quantiles in high speed data streams with deter- ministic error bounds. For data streams of size N where N is unknown in advance, our algorithm par- titions the stream into sub-streams of exponentially increasing size as they arrive. For each sub-stream which has a fixed size, we compute and maintain a multi-level summary structure using a novel algorithm. In order to achieve high speed performance, the algo- rithm uses simple block-wise merge and sample oper- ations. Overall, our algorithms for fixed-size streams and arbitrary-size streams have a computational cost of O(N log( \frac{1} { \in } log \in N)) and an average per-element update cost of O(log logN) if \in is fixed.
Citation:
Qi Zhang, Wei Wang, "A Fast Algorithm for Approximate Quantiles in High Speed Data Streams," ssdbm, pp.29, 19th International Conference on Scientific and Statistical Database Management (SSDBM 2007), 2007