2010 29th IEEE Symposium on Reliable Distributed Systems
Invariants Based Failure Diagnosis in Distributed Computing Systems
New Delhi, India
October 31-November 03
ISBN: 978-0-7695-4250-8
This paper presents an instance-based approach to diagnosing failures in computing systems. Because a large portion of failures are repeated ones, our method takes advantage of past experience by storing historical failures in a database and retrieving similar instances when a failure occurs. We extract the system ‘invariants’ by modeling consistent dependencies between system attributes during operation, and construct a network graph from the learned invariants. When a failure happens, the status of the invariant network, i.e., whether each invariant link is broken or not, provides a view of the failure's characteristics. We use a high-dimensional binary vector to store this failure evidence, and develop a novel algorithm to efficiently retrieve failure signatures from the database. Experimental results on a web-based system demonstrate the effectiveness of our method in diagnosing injected failures.
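The signature idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the invariant model, the similarity measure, or the retrieval algorithm, so everything below is an assumption made for illustration: each invariant is treated as broken when a hypothetical residual exceeds a hypothetical threshold, signatures are compared with Jaccard similarity, and retrieval is a plain linear scan rather than the paper's efficient algorithm. The names `broken_vector`, `jaccard`, and `retrieve`, and the failure labels, are invented here.

```python
from typing import Dict, List, Tuple

Signature = Tuple[int, ...]


def broken_vector(residuals: List[float], thresholds: List[float]) -> Signature:
    """Build a binary signature: 1 where an invariant is broken, else 0.

    Assumption: invariant i counts as broken when the absolute residual of
    its model exceeds a per-invariant threshold.
    """
    return tuple(int(abs(r) > t) for r, t in zip(residuals, thresholds))


def jaccard(a: Signature, b: Signature) -> float:
    """Jaccard similarity between two binary signatures (assumed metric)."""
    both = sum(x & y for x, y in zip(a, b))
    either = sum(x | y for x, y in zip(a, b))
    # Two signatures with no broken invariants are treated as identical.
    return both / either if either else 1.0


def retrieve(query: Signature, database: Dict[str, Signature],
             k: int = 1) -> List[Tuple[str, float]]:
    """Return the k stored failures most similar to the query (linear scan)."""
    scored = [(label, jaccard(query, sig)) for label, sig in database.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical database of past failures over five invariants.
history = {
    "db_lock": (1, 1, 0, 0, 1),
    "memory_leak": (0, 0, 1, 1, 0),
}

# A new failure breaks the first two invariants only.
query = broken_vector([0.9, -0.8, 0.05, 0.01, 0.1], [0.5] * 5)
best_label, best_score = retrieve(query, history)[0]
```

Here the query signature is `(1, 1, 0, 0, 0)`, which shares two broken invariants with the stored `db_lock` signature and none with `memory_leak`, so `db_lock` is returned as the closest past instance. A linear scan is adequate for a small database; the paper's contribution is a faster retrieval method for high-dimensional vectors, which this sketch does not attempt to reproduce.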
Index Terms:
Failure Diagnosis, Distributed Systems, Invariants
Citation:
Haifeng Chen, Guofei Jiang, Kenji Yoshihira, Akhilesh Saxena, "Invariants Based Failure Diagnosis in Distributed Computing Systems," srds, pp.160-166, 2010 29th IEEE Symposium on Reliable Distributed Systems, 2010