Issue No.03 - Third Quarter (2012 vol.5)
Yang Song , IBM Research, Hawthorne
Anca Sailer , Microsoft Research, Redmond
Hidayatullah Shaikh , IBM Research, Hawthorne
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TSC.2011.3
The overwhelming amount of monitoring and log data generated in multitier IT systems makes problem determination one of the most expensive and labor-intensive tasks in the IT services arena. In particular, the initial step of problem classification is complicated by error propagation, which causes secondary problems to surface on multiple dependent resources. In this paper, we propose to automate the process of problem classification by leveraging machine learning. The main focus is to categorize the problem a user experiences by recognizing its specific root cause, using available training data such as monitoring data and logs from across the systems. We transform the structure of the problem into a hierarchy using an existing taxonomy. We then propose an efficient hierarchical incremental learning algorithm that is capable of adjusting the parameters of its internal local classifiers in real time. Compared to traditional batch learning algorithms, this online solution decreases the computational complexity of the training process by learning from new instances in an incremental fashion. Our approach also significantly reduces the memory required to store the training instances. We demonstrate the efficiency of our approach by learning hierarchical problem patterns for several issues occurring in distributed web applications. Experimental results show that our approach substantially outperforms previous methods.
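To make the idea of local classifiers updated one instance at a time more concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a simple multiclass perceptron at each internal node of the taxonomy, sparse feature dictionaries built from log or monitoring signals, and a hypothetical two-level taxonomy; the class name, feature names, and category labels are all invented for illustration.

```python
class HierarchicalOnlineClassifier:
    """Top-down classifier with one perceptron-style local classifier per
    internal taxonomy node (an illustrative stand-in, not the paper's method)."""

    def __init__(self, taxonomy, root="root"):
        self.children = taxonomy   # parent label -> list of child labels
        self.root = root
        self.weights = {}          # label -> {feature name: weight}

    def _score(self, label, features):
        w = self.weights.get(label, {})
        return sum(w.get(f, 0.0) * v for f, v in features.items())

    def _best_child(self, parent, features):
        # Ties are broken in favor of the child listed first.
        return max(self.children[parent], key=lambda c: self._score(c, features))

    def predict(self, features):
        """Walk from the root to a leaf, picking the best child at each level."""
        path, node = [], self.root
        while self.children.get(node):
            node = self._best_child(node, features)
            path.append(node)
        return path

    def update(self, features, true_path):
        """Learn from a single labeled instance, then discard it.
        Only the local classifiers along the true path are touched."""
        parent = self.root
        for true_child in true_path:
            predicted = self._best_child(parent, features)
            if predicted != true_child:
                good = self.weights.setdefault(true_child, {})
                bad = self.weights.setdefault(predicted, {})
                for f, v in features.items():
                    good[f] = good.get(f, 0.0) + v   # promote the correct child
                    bad[f] = bad.get(f, 0.0) - v     # demote the wrong child
            parent = true_child


# Hypothetical taxonomy and event stream, for illustration only.
taxonomy = {
    "root": ["database", "network"],
    "database": ["connection_pool", "disk_full"],
    "network": ["dns_failure", "packet_loss"],
}
events = [
    ({"sql_error": 1, "pool_exhausted": 1}, ["database", "connection_pool"]),
    ({"sql_error": 1, "disk_usage_high": 1}, ["database", "disk_full"]),
    ({"timeout": 1, "nxdomain": 1}, ["network", "dns_failure"]),
    ({"timeout": 1, "retransmits": 1}, ["network", "packet_loss"]),
]

model = HierarchicalOnlineClassifier(taxonomy)
for features, path in events * 2:   # the same four events recur once in the stream
    model.update(features, path)

print(model.predict({"timeout": 1, "nxdomain": 1}))
```

In this sketch the model keeps only per-label weight vectors, so no training instance needs to be stored after its update, which is the kind of memory saving an online approach offers over batch training. Each update costs time proportional to the depth of the taxonomy times the number of active features.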
Training, Taxonomy, Training data, Monitoring, Prediction algorithms, Frequency modulation, Machine learning, services computing, artificial intelligence, computing methodologies, services technologies, principles of services
Yang Song, Anca Sailer, Hidayatullah Shaikh, "Hierarchical Online Problem Classification for IT Support Services", IEEE Transactions on Services Computing, vol. 5, no. 3, pp. 345-357, Third Quarter 2012, doi:10.1109/TSC.2011.3