2010 29th IEEE Symposium on Reliable Distributed Systems
Invariants Based Failure Diagnosis in Distributed Computing Systems
New Delhi, India
October 31-November 03
ISBN: 978-0-7695-4250-8
ASCII Text:
Haifeng Chen, Guofei Jiang, Kenji Yoshihira, Akhilesh Saxena, "Invariants Based Failure Diagnosis in Distributed Computing Systems," Reliable Distributed Systems, IEEE Symposium on, pp. 160-166, 2010 29th IEEE Symposium on Reliable Distributed Systems, 2010.

BibTeX:
@article{10.1109/SRDS.2010.26,
  author = {Haifeng Chen and Guofei Jiang and Kenji Yoshihira and Akhilesh Saxena},
  title = {Invariants Based Failure Diagnosis in Distributed Computing Systems},
  journal = {Reliable Distributed Systems, IEEE Symposium on},
  volume = {0},
  year = {2010},
  issn = {1060-9857},
  pages = {160-166},
  doi = {http://doi.ieeecomputersociety.org/10.1109/SRDS.2010.26},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}

RefWorks Procite/RefMan/Endnote:
TY  - CONF
JO  - Reliable Distributed Systems, IEEE Symposium on
TI  - Invariants Based Failure Diagnosis in Distributed Computing Systems
SN  - 1060-9857
SP  - 160
EP  - 166
A1  - Haifeng Chen
A1  - Guofei Jiang
A1  - Kenji Yoshihira
A1  - Akhilesh Saxena
PY  - 2010
KW  - Failure Diagnosis
KW  - Distributed Systems
KW  - Invariants
VL  - 0
JA  - Reliable Distributed Systems, IEEE Symposium on
ER  -
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/SRDS.2010.26
This paper presents an instance-based approach to diagnosing failures in computing systems. Because a large portion of failures are recurrences of earlier ones, our method takes advantage of past experience by storing historical failures in a database and retrieving similar instances when a new failure occurs. We extract the system ‘invariants’ by modeling consistent dependencies between system attributes during operation, and construct a network graph from the learned invariants. When a failure happens, the status of the invariant network, i.e., whether each invariant link is broken or not, provides a view of the failure's characteristics. We use a high-dimensional binary vector to store this failure evidence, and develop a novel algorithm to efficiently retrieve failure signatures from the database. Experimental results on a web-based system demonstrate the effectiveness of our method in diagnosing injected failures.
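The signature idea in the abstract can be illustrated with a small sketch. The abstract does not describe how invariants are modeled or how the retrieval algorithm works, so everything below is an assumption made for illustration: the linear form of the `Invariant` class, the metric names, the tolerances, the stored failure labels, and the brute-force Hamming-distance search that stands in for the paper's own (unspecified) retrieval algorithm.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Invariant:
    """Hypothetical linear invariant y ~= gain * x + offset between two metrics."""
    x: str
    y: str
    gain: float
    offset: float
    tolerance: float

    def is_broken(self, sample: dict[str, float]) -> bool:
        # The link counts as broken when the residual exceeds the tolerance.
        residual = sample[self.y] - (self.gain * sample[self.x] + self.offset)
        return abs(residual) > self.tolerance


def signature(invariants: list[Invariant], sample: dict[str, float]) -> int:
    """Pack the broken/intact status of every invariant into an integer bit vector."""
    bits = 0
    for i, inv in enumerate(invariants):
        if inv.is_broken(sample):
            bits |= 1 << i  # bit i set means invariant i is broken
    return bits


def hamming(a: int, b: int) -> int:
    """Number of invariant links whose status differs between two signatures."""
    return bin(a ^ b).count("1")


def retrieve(query: int, database: dict[str, int], k: int = 1) -> list[tuple[str, int]]:
    """Return the k stored failures closest to the query (brute-force scan)."""
    ranked = sorted(
        ((label, hamming(query, sig)) for label, sig in database.items()),
        key=lambda pair: (pair[1], pair[0]),
    )
    return ranked[:k]


# Illustrative invariants and failure history (all values are made up).
invariants = [
    Invariant("http_requests", "sql_queries", gain=2.0, offset=0.0, tolerance=5.0),
    Invariant("http_requests", "cpu_util", gain=0.1, offset=5.0, tolerance=3.0),
    Invariant("sql_queries", "disk_io", gain=1.5, offset=0.0, tolerance=10.0),
]
database = {"db_lock": 0b101, "cpu_hog": 0b010}

sample = {"http_requests": 100.0, "sql_queries": 120.0, "cpu_util": 15.5, "disk_io": 40.0}
query = signature(invariants, sample)
best_match = retrieve(query, database, k=1)
```

A linear scan like this costs time proportional to the number of stored signatures, which is presumably why the paper develops a more efficient retrieval method for high-dimensional vectors; the sketch only shows what is being compared, not how to do it at scale.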
Index Terms:
Failure Diagnosis, Distributed Systems, Invariants
