Issue No. 06 - June (2013 vol. 24)
ISSN: 1045-9219
pp: 1245-1255
Michael Rung-Tsong Lyu, The Chinese University of Hong Kong, Hong Kong
Yangfan Zhou, The Chinese University of Hong Kong, Shatin
Huaimin Wang, National University of Defense Technology, Changsha
Haibo Mi, National University of Defense Technology, Changsha
Hua Cai, Alibaba Cloud Computing, Alibaba Inc., Hangzhou
Performance diagnosis is labor intensive in production cloud computing systems. Such systems face many real-world challenges that existing diagnosis techniques for distributed systems cannot effectively address. An efficient, unsupervised diagnosis tool for locating fine-grained performance anomalies is still lacking in production cloud computing systems. This paper proposes CloudDiag to bridge this gap. Combining a statistical technique with a fast matrix recovery algorithm, CloudDiag can efficiently pinpoint fine-grained causes of performance problems without requiring any domain-specific knowledge of the target system. CloudDiag has been applied in a production cloud computing system to diagnose performance problems. We demonstrate the effectiveness of CloudDiag in three real-world case studies.
Production, Cloud computing, Electronic mail, Synchronization, Time factors, Data collection, Clocks, request tracing, performance diagnosis
Michael Rung-Tsong Lyu, Yangfan Zhou, Huaimin Wang, Haibo Mi, Hua Cai, "Toward Fine-Grained, Unsupervised, Scalable Performance Diagnosis for Production Cloud Computing Systems", IEEE Transactions on Parallel & Distributed Systems, vol. 24, no. 6, pp. 1245-1255, June 2013, doi:10.1109/TPDS.2013.21