Experimental Evaluation of Behavior-Based Failure-Detection Schemes in Real-Time Communication Networks
Issue No. 06 - June (1999 vol. 10)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/71.774910
<p><b>Abstract</b>—Effective detection of failures is essential for reliable communication services. Traditionally, non-real-time computer networks have relied on behavior-based techniques for detecting communication failures. That is, each node uses heartbeats to detect the failure of its neighbors and the end-to-end transport protocol (e.g., TCP) achieves reliable communication by acknowledgment/retransmission. Recently, there has been a growing demand for reliable “real-time” communication, but little research has been done on the failure detection problem. In this paper, we present two behavior-based failure-detection schemes—neighbor detection and end-to-end detection—for reliable real-time communication services and experimentally evaluate their effectiveness. Specifically, we measure and analyze the coverage and latency of these detection schemes through fault-injection experiments. The experimental results have shown that nearly all failures can be detected very quickly by the neighbor detection scheme, while the end-to-end detection scheme uncovers the remaining failures with larger detection latencies.</p>
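As a rough illustration of the ideas in the abstract, the sketch below shows a timeout-based heartbeat check of the kind a node might use to suspect a failed neighbor, together with a helper that computes coverage and mean latency from fault-injection trials. It is not the authors' implementation: the names `NeighborMonitor`, `record_heartbeat`, `suspected`, and `coverage_and_mean_latency`, the timeout value, and the use of explicit timestamps are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class NeighborMonitor:
    """Hypothetical heartbeat monitor (illustrative only).

    A neighbor is suspected of failure once no heartbeat has been
    recorded from it for more than `timeout` time units.
    """
    timeout: float
    last_seen: Dict[str, float] = field(default_factory=dict)

    def record_heartbeat(self, neighbor: str, now: float) -> None:
        # Remember the most recent time a heartbeat arrived from this neighbor.
        self.last_seen[neighbor] = now

    def suspected(self, now: float) -> List[str]:
        # Neighbors whose last heartbeat is older than the timeout.
        return sorted(
            n for n, t in self.last_seen.items() if now - t > self.timeout
        )


def coverage_and_mean_latency(
    latencies: List[Optional[float]],
) -> Tuple[float, Optional[float]]:
    """Summarize fault-injection trials (assumed data layout).

    latencies[i] is the detection latency of injected fault i,
    or None if that fault was never detected.
    """
    detected = [x for x in latencies if x is not None]
    coverage = len(detected) / len(latencies) if latencies else 0.0
    mean_latency = sum(detected) / len(detected) if detected else None
    return coverage, mean_latency
```

In this sketch, coverage is the fraction of injected faults that were detected and latency is averaged over detected faults only; the paper's exact definitions may differ.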
Index Terms: Real-time communication, network failures, failure detection, fault-injection experiments.
K. G. Shin and S. Han, "Experimental Evaluation of Behavior-Based Failure-Detection Schemes in Real-Time Communication Networks," in IEEE Transactions on Parallel & Distributed Systems, vol. 10, no. 6, pp. 613-626, 1999.