Graham Cormode, S. Muthukrishnan, Wei Zhuang, "What's Different: Distributed, Continuous Monitoring of Duplicate-Resilient Aggregates on Data Streams," 22nd International Conference on Data Engineering (ICDE'06), p. 57, IEEE Computer Society, Los Alamitos, CA, USA, 2006. ISBN 0-7695-2570-9. doi:10.1109/ICDE.2006.173
A fundamental issue that arises is that one cannot make the "uniqueness" assumption on observed events that is present in previous work, since wide-scale monitoring invariably encounters the same events at different points. For example, within the network of an Internet Service Provider, packets of the same flow will be observed at different routers; similarly, when monitoring wild animals, the same individual will be observed by multiple mobile sensors. Aggregates of interest in such distributed environments must be resilient to duplicate observations.
We study such duplicate-resilient aggregates that measure the extent of the duplication (how many unique observations are there, how many observations are unique), as well as standard holistic aggregates such as quantiles and heavy hitters over the unique items. We present accuracy-guaranteed, highly communication-efficient algorithms for these aggregates that work within the time and space constraints of high-speed streams. We also present results of a detailed experimental study on both real-life and synthetic data.
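To make the two duplication measures concrete, here is a minimal sketch that computes them exactly over a few small, hypothetical per-router observation lists. It is not the paper's method: the paper targets approximate, communication-efficient monitoring, whereas this version simply gathers everything in one place. The function names, the sample flow IDs, and the reading of "how many observations are unique" as "items seen exactly once across all sites" are assumptions made for illustration.

```python
from collections import Counter

def distinct_count(site_streams):
    """Number of distinct items seen anywhere; an item seen at several sites counts once."""
    seen = set()
    for stream in site_streams:
        seen.update(stream)
    return len(seen)

def observed_once_count(site_streams):
    """Number of items that occur exactly once over all sites combined.

    Assumption: this is one plausible reading of "how many observations are unique".
    """
    totals = Counter()
    for stream in site_streams:
        totals.update(stream)
    return sum(1 for count in totals.values() if count == 1)

# Hypothetical data: flow IDs observed at three routers of the same network.
router_a = ["f1", "f2", "f3"]
router_b = ["f2", "f3", "f4"]
router_c = ["f3", "f5"]
sites = [router_a, router_b, router_c]

# Adding up per-site counts overstates the number of flows,
# because the same flow is reported by several routers.
naive_total = sum(len(stream) for stream in sites)

distinct = distinct_count(sites)
once = observed_once_count(sites)

print(naive_total, distinct, once)
```

The gap between the naive total and the distinct count is the kind of overcounting a duplicate-resilient aggregate has to avoid. The exact approach shown here stores every item and ships every observation to one place, which is what the time, space, and communication constraints of high-speed streams rule out.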
