Issue No. 11 - November (2007 vol. 19)
Distributed systems generate large amounts of monitoring data, such as log files, to track their operational status. However, it is hard to correlate such monitoring data effectively across distributed systems and over the observation period for system management. In previous work, we proposed a concept named flow intensity to measure the intensity with which internal monitoring data reacts to the volume of user requests. We calculated flow intensity measurements from monitoring data and proposed an algorithm to automatically search for constant relationships between flow intensities measured at various points across distributed systems. If such relationships hold all the time, we regard them as invariants of the underlying systems. Invariants can be used to characterize complex systems and support various system management tasks. However, the computational complexity of the previous invariant search algorithm is high, so it may not scale well to large systems with thousands of measurements. In this paper, we propose two efficient but approximate algorithms for inferring invariants in large-scale systems. The computational complexity of the new randomized algorithms is significantly reduced, and experimental results from a real system demonstrate the accuracy and efficiency of the new algorithms.
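To make the idea of an invariant between flow intensities concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a simple linear relationship between two measurements, an exhaustive pairwise search, and a hypothetical fitness threshold of 0.9; the function names, the window scheme, and the scoring formula are all illustrative assumptions. With n measurements, such an exhaustive search examines n(n-1)/2 pairs, which hints at why scalability becomes a concern with thousands of measurements.

```python
from itertools import combinations


def fit_linear(x, y):
    """Least-squares fit y ~ a*x + b; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    if var_x == 0:
        return 0.0, mean_y
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = cov / var_x
    return a, mean_y - a * mean_x


def fitness(x, y, a, b):
    """Assumed score: 1.0 means the model a*x + b reproduces y exactly."""
    mean_y = sum(y) / len(y)
    resid = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y)) ** 0.5
    spread = sum((yi - mean_y) ** 2 for yi in y) ** 0.5
    if spread == 0:
        return 1.0 if resid == 0 else 0.0
    return 1.0 - resid / spread


def likely_invariants(series, window, threshold=0.9):
    """Exhaustive pairwise search (illustrative only).

    Fits a model on the first window of each pair and keeps the pair
    only if that same model scores >= threshold on every window.
    """
    found = []
    for u, v in combinations(sorted(series), 2):
        x, y = series[u], series[v]
        a, b = fit_linear(x[:window], y[:window])
        holds = all(
            fitness(x[s:s + window], y[s:s + window], a, b) >= threshold
            for s in range(0, len(x) - window + 1, window)
        )
        if holds:
            found.append((u, v))
    return found


# Hypothetical flow intensities sampled at three monitoring points.
requests = [10, 20, 30, 40, 50, 60, 70, 80]
series = {
    "requests": requests,
    "sql": [2 * r + 1 for r in requests],  # tracks request volume
    "noise": [5, 1, 9, 2, 8, 3, 7, 1],     # unrelated measurement
}
print(likely_invariants(series, window=4))
```

In this toy data only the pair whose relationship stays constant across both windows is kept; the unrelated measurement is rejected because no single linear model explains it.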
Distributed Systems, System Management, Data mining, Time series analysis, Algorithms for data and knowledge management, Analysis of Algorithms and Problem Complexity
Haifeng Chen, Kenji Yoshihira, "Efficient and Scalable Algorithms for Inferring Likely Invariants in Distributed Systems", IEEE Transactions on Knowledge & Data Engineering, vol. 19, no. 11, pp. 1508-1523, November 2007, doi:10.1109/TKDE.2007.190648