Autonomic Computing, International Conference on (2005)
June 13, 2005 to June 16, 2005
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICAC.2005.11
David Breitgand , IBM Haifa Research Lab
Ealan Henis , IBM Haifa Research Lab
Onn Shehory , IBM Haifa Research Lab
Threshold violations reported for system components signal undesirable conditions in the system. In complex computer systems, characterized by dynamically changing workload patterns and evolving business goals, precomputed performance thresholds on the operational values of the performance metrics of individual system components are not available. This paper focuses on a fundamental enabling technology for performance management: automatic computation and adaptation of statistically meaningful performance thresholds for system components. We formally define the problem of adaptive threshold setting with controllable accuracy of the thresholds and propose a novel algorithm for solving it. Given a set of Service Level Objectives (SLOs) of the applications executing in the system, our algorithm continually adapts the per-component performance thresholds to the observed SLO violations. The purpose of this continual threshold adaptation is to control the average rates of false positive and false negative alarms and thereby improve the efficacy of threshold-based management. We implemented the proposed algorithm and applied it to a relatively simple, albeit non-trivial, storage system. In our experiments we achieved a positive predictive value of 92% and a negative predictive value of 93% for component-level performance thresholds.
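The two accuracy figures quoted in the abstract can be made concrete with a small sketch. It is not the paper's algorithm; it only illustrates how a positive predictive value (PPV) and a negative predictive value (NPV) could be computed for a fixed component threshold, assuming an alarm is raised when a metric sample exceeds the threshold and that each sample is labeled with whether an SLO violation was observed. The function name `predictive_values` and the sample data are hypothetical.

```python
from typing import Sequence, Tuple


def predictive_values(
    metric: Sequence[float],
    slo_violated: Sequence[bool],
    threshold: float,
) -> Tuple[float, float]:
    """Return (PPV, NPV) of the rule 'alarm if metric > threshold'.

    Assumption for illustration: metric[i] and slo_violated[i] describe
    the same observation interval.
    """
    tp = fp = tn = fn = 0
    for value, violated in zip(metric, slo_violated):
        alarm = value > threshold
        if alarm and violated:
            tp += 1  # alarm raised, SLO really violated
        elif alarm and not violated:
            fp += 1  # false positive alarm
        elif not alarm and violated:
            fn += 1  # missed violation (false negative)
        else:
            tn += 1  # no alarm, no violation
    # PPV: fraction of alarms that coincide with an SLO violation.
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    # NPV: fraction of quiet intervals that are truly violation-free.
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Hypothetical samples of one component metric and the observed SLO state.
samples = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
violations = [False, False, True, False, True, True]
ppv, npv = predictive_values(samples, violations, threshold=3.5)
```

An adaptive scheme in the spirit of the abstract would re-evaluate such statistics as new SLO observations arrive and move the threshold to keep both values near their targets; how the paper does this is not specified in the abstract.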
O. Shehory, E. Henis and D. Breitgand, "Automated and Adaptive Threshold Setting: Enabling Technology for Autonomy and Self-Management," Autonomic Computing, International Conference on (ICAC), Seattle, Washington, 2005, pp. 204-215.