2009 Ninth IEEE International Conference on Data Mining (2009)
Dec. 6, 2009 to Dec. 9, 2009
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDM.2009.19
We describe a novel application of data mining and statistical learning methods to automatically monitor and detect abnormal execution traces from console logs in an online setting. Unlike existing solutions, we use a two-stage detection system. The first stage uses frequent pattern mining and distribution estimation techniques to capture the dominant patterns (both frequent sequences and time durations). The second stage uses a principal component analysis based anomaly detection technique to identify actual problems. Using real system data from a 203-node Hadoop cluster, we show that we can not only achieve highly accurate and fast problem detection, but also help operators better understand execution patterns in their system.
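The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the event names, the toy traces, the `min_support` and `threshold` values, the use of whole traces as "patterns", and a single principal component found by power iteration. The abstract does not say how the detection threshold is chosen, so a fixed value stands in for it here.

```python
from collections import Counter
import math

def frequent_patterns(traces, min_support):
    """Stage 1 (sketch): whole traces whose share of the history is >= min_support."""
    counts = Counter(tuple(t) for t in traces)
    return {p for p, c in counts.items() if c / len(traces) >= min_support}

def count_vector(trace, vocab):
    """Represent a trace by how often each known event type occurs in it."""
    counts = Counter(trace)
    return [float(counts[e]) for e in vocab]

def fit_pca(rows, iters=200):
    """Return (mean, leading principal direction) using power iteration."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    cen = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in cen) / n for j in range(d)] for i in range(d)]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance left to follow
            break
        v = [x / norm for x in w]
    return mean, v

def spe(x, mean, v):
    """Squared prediction error: squared distance from x to the principal axis."""
    c = [a - m for a, m in zip(x, mean)]
    proj = sum(a * b for a, b in zip(c, v))
    return sum((a - proj * b) ** 2 for a, b in zip(c, v))

def classify(trace, patterns, vocab, mean, v, threshold):
    """Stage 1 accepts known frequent patterns; stage 2 scores the rest."""
    if tuple(trace) in patterns:
        return "normal (frequent pattern)"
    score = spe(count_vector(trace, vocab), mean, v)
    return "anomaly" if score > threshold else "normal (low residual)"

# Hypothetical event vocabulary and history of traces (not data from the paper).
VOCAB = ["open", "write", "close"]
history = (
    [["open", "write", "close"]] * 6
    + [["open", "write", "write", "close"]] * 3
    + [["open", "write", "write", "write", "close"]] * 1
)
patterns = frequent_patterns(history, min_support=0.25)
mean, v = fit_pca([count_vector(t, VOCAB) for t in history])
THRESHOLD = 0.5  # assumed fixed cutoff for this sketch

for t in (["open", "write", "close"],
          ["open", "write", "write", "write", "close"],
          ["open", "write", "write"]):
    print(t, "->", classify(t, patterns, VOCAB, mean, v, THRESHOLD))
```

In this toy history the only variation is the number of `write` events, so the principal direction lies along that axis. A rare but well-formed trace with three writes has no residual and passes stage 2, while the truncated trace that never reaches `close` leaves the principal axis and is flagged.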
console logs, system management, monitoring, problem detection, logs, pattern mining
Ling Huang, David Patterson, Armando Fox, Wei Xu, Michael Jordan, "Online System Problem Detection by Mining Patterns of Console Logs", 2009 Ninth IEEE International Conference on Data Mining, pp. 588-597, 2009, doi:10.1109/ICDM.2009.19