Leeds, United Kingdom
Oct. 2, 2006 to Oct. 4, 2006
Matthias Wiesmann, Japan Advanced Institute of Science and Technology
Peter Urban, Japan Advanced Institute of Science and Technology
Xavier Defago, Japan Advanced Institute of Science and Technology
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/SRDS.2006.9
In this paper, we present the SNMP-FD service, a novel failure detection service based entirely on the Simple Network Management Protocol (SNMP). This approach promises better interoperability with external tools and failure information sources, including network equipment and cluster management tools. We first show how the SNMP standard can be used to build a failure detection service. We describe the already standardized interfaces that can be reused and introduce the interfaces that need to be added. SNMP is used extensively in the service: for messaging, process status description, configuration, service statistics, and delivery of failure detection information to applications. We then present our implementation and an evaluation of its performance and quality of service.
Matthias Wiesmann, Peter Urban, Xavier Defago, "An SNMP based failure detection service", SRDS, Reliable Distributed Systems, IEEE Symposium on, 2006, pp. 365-376, doi:10.1109/SRDS.2006.9