Issue No. 08 - Aug. (2014 vol. 26)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2013.180
Daniel Keren, Department of Computer Science, Haifa University, Haifa, Israel
Guy Sagy, Faculty of Computer Science, Israeli Institute of Technology, Haifa, Israel
Amir Abboud, Faculty of Computer Science, Israeli Institute of Technology, Haifa, Israel
David Ben-David, Faculty of Computer Science, Israeli Institute of Technology, Haifa, Israel
Assaf Schuster, Faculty of Computer Science, Israeli Institute of Technology, Haifa, Israel
Izchak Sharfman, Faculty of Computer Science, Israeli Institute of Technology, Haifa, Israel
Antonios Deligiannakis, Department of Electronic and Computer Engineering, Technical University of Crete, Chania, Greece
Interest in stream monitoring is shifting toward the distributed case. In many applications the data is high volume, dynamic, and distributed, making it infeasible to collect the distinct streams to a central node for processing. Often, the monitoring problem consists of determining whether the value of a global function, defined on the union of all streams, crossed a certain threshold. We wish to reduce communication by transforming the global monitoring to the testing of local constraints, checked independently at the nodes. Geometric monitoring (GM) proved useful for constructing such local constraints for general functions. Alas, in GM the constraints at all nodes share an identical structure and are thus unsuitable for handling heterogeneous streams. Therefore, we propose a general approach for monitoring heterogeneous streams (HGM), which defines constraints tailored to fit the data distributions at the nodes. While we prove that optimally selecting the constraints is NP-hard, we provide a practical solution, which reduces the running time by hierarchically clustering nodes with similar data distributions and then solving simpler optimization problems. We also present a method for efficiently recovering from local violations at the nodes. Experiments yield an improvement of over an order of magnitude in communication relative to GM.
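The idea of replacing a global threshold test with local constraints can be illustrated with a minimal sketch. It is not the GM or HGM construction from the paper; it assumes a simplified setting in which the global function is evaluated at the average of the nodes' local vectors, and every node checks membership in the same convex "safe zone" (here a ball). Because a convex set contains the average of any points inside it, the global average cannot leave the zone while all local checks pass. The function names, the monitored function f(v) = ||v||, the threshold T, and the sample data are all hypothetical.

```python
import math

def in_safe_zone(x, center, radius):
    """Local constraint: is this node's vector inside the ball (a convex set)?"""
    return math.dist(x, center) <= radius

def nodes_in_violation(local_vectors, center, radius):
    """Indices of nodes whose local constraint fails and that would have to communicate."""
    return [i for i, x in enumerate(local_vectors)
            if not in_safe_zone(x, center, radius)]

def global_average(local_vectors):
    """Coordinate-wise average; computed here only to verify the guarantee."""
    n = len(local_vectors)
    return [sum(coords) / n for coords in zip(*local_vectors)]

# Hypothetical task: keep f(v) = ||v|| at or below T, where v is the global average.
T = 5.0
center, radius = [0.0, 0.0], T  # the safe zone is the admissible region itself

# Case 1: every node satisfies its local constraint, so no communication is needed.
local_vectors = [[1.0, 2.0], [3.0, -1.0], [-2.0, 4.0]]
violators = nodes_in_violation(local_vectors, center, radius)
global_ok = math.dist(global_average(local_vectors), center) <= T

# Case 2: both nodes violate locally, yet the global average is still admissible.
skewed_vectors = [[6.0, 0.0], [-6.0, 0.0]]
skewed_violators = nodes_in_violation(skewed_vectors, center, radius)
skewed_global_ok = math.dist(global_average(skewed_vectors), center) <= T
```

In the first case the local checks alone certify that the threshold has not been crossed. The second case shows why recovery from local violations matters: a local violation is only a possible global crossing, and nodes whose data differ strongly from one another can trigger such false alarms when they all share one identical constraint.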
pattern clustering, computational complexity, data handling, optimisation
D. Keren et al., "Geometric Monitoring of Heterogeneous Streams," in IEEE Transactions on Knowledge & Data Engineering, vol. 26, no. 8, pp. 1890-1903, 2014.