2015 IEEE 31st International Conference on Data Engineering (ICDE) (2015)
Seoul, South Korea
April 13, 2015 to April 17, 2015
ISBN: 978-1-4799-7964-6
pp: 1167-1178
Sudip Roy, Google Research, Mountain View, CA 94043, USA
Arnd Christian Konig, Microsoft Research, Redmond, WA 98052, USA
Igor Dvorkin, Microsoft Corp., Redmond, WA 98052, USA
Manish Kumar, Microsoft Corp., Redmond, WA 98052, USA
ABSTRACT
Cloud platforms involve multiple independently developed components, often executing on diverse hardware configurations and across multiple data centers. This complexity makes tracking key performance indicators (KPIs) and manually diagnosing anomalies in system behavior both difficult and expensive. In this paper, we describe PerfAugur, an automated system for mining service logs to identify anomalies and help formulate data-driven hypotheses. PerfAugur includes a suite of efficient mining algorithms for detecting significant anomalies in system behavior, along with potential explanations for such anomalies, without the need for an explicit supervision signal. We perform an extensive experimental evaluation using both synthetic and real-life data sets, and present detailed case studies showing the impact of this technology on operations of the Windows Azure Service.
INDEX TERMS
Robustness, Aggregates, Data mining, Electric breakdown, Telemetry, Context, Atomic measurements
CITATION
Sudip Roy, Arnd Christian Konig, Igor Dvorkin, Manish Kumar, "PerfAugur: Robust diagnostics for performance anomalies in cloud services", 2015 IEEE 31st International Conference on Data Engineering (ICDE), pp. 1167-1178, 2015, doi:10.1109/ICDE.2015.7113365