Abstractions for Node Level Passive Fault Detection in Distributed Systems
June 1983 (vol. 32 no. 6)
pp. 543-550
K.N. Oikonomou, Bell Laboratories
We introduce a scheme for passive node-level fault detection in a distributed system. With each system node we associate a low-cost, low-complexity observer that monitors the pattern of incoming and outgoing messages and compares it against an abstracted model of the node's behavior. We develop a fault detection procedure, which is probabilistic because of nondeterminism in the simplified node model. Abstraction reduces model complexity but renders some errors undetectable by the observer. In the paper we characterize these undetectable errors. Succeeding studies show how to select model abstractions to lower the number of undetectable errors.
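The observer idea in the abstract can be illustrated with a minimal sketch. It is not the paper's procedure: it assumes the node model is a nondeterministic finite-state machine, it tracks only the set of abstract states the node could be in, and it leaves out the probabilistic part of the detection procedure entirely. The names `Observer`, `run`, `DETAILED`, and `ABSTRACT`, and the message types `read`, `write`, `ack`, and `done`, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple

# An observed event is (direction, message type), e.g. ("in", "read").
Event = Tuple[str, str]
# Nondeterministic transition relation: (state, event) -> possible next states.
Transitions = Dict[Tuple[str, Event], Set[str]]


class Observer:
    """Passive observer: tracks which model states are consistent with the trace."""

    def __init__(self, transitions: Transitions, initial: str) -> None:
        self.transitions = transitions
        self.possible: Set[str] = {initial}
        self.fault_detected = False

    def observe(self, event: Event) -> bool:
        """Consume one observed message; return True once a fault is flagged."""
        if self.fault_detected:
            return True
        successors: Set[str] = set()
        for state in self.possible:
            successors |= self.transitions.get((state, event), set())
        if successors:
            self.possible = successors
        else:
            # No model state explains this message: flag a fault.
            self.fault_detected = True
        return self.fault_detected


def run(transitions: Transitions, initial: str, trace: List[Event]) -> bool:
    """Feed a whole trace to a fresh observer and report whether it flagged a fault."""
    observer = Observer(transitions, initial)
    for event in trace:
        observer.observe(event)
    return observer.fault_detected


# Hypothetical detailed model: a read must be answered by "ack", a write by "done".
DETAILED: Transitions = {
    ("idle", ("in", "read")): {"reading"},
    ("idle", ("in", "write")): {"writing"},
    ("reading", ("out", "ack")): {"idle"},
    ("writing", ("out", "done")): {"idle"},
}

# Hypothetical abstraction: "reading" and "writing" are merged into "busy".
ABSTRACT: Transitions = {
    ("idle", ("in", "read")): {"busy"},
    ("idle", ("in", "write")): {"busy"},
    ("busy", ("out", "ack")): {"idle"},
    ("busy", ("out", "done")): {"idle"},
}

wrong_reply = [("in", "read"), ("out", "done")]   # node answers a read with "done"
unsolicited = [("out", "ack")]                    # node replies without a request

detailed_catches_wrong_reply = run(DETAILED, "idle", wrong_reply)
abstract_catches_wrong_reply = run(ABSTRACT, "idle", wrong_reply)
abstract_catches_unsolicited = run(ABSTRACT, "idle", unsolicited)
```

Under these assumptions the abstract model has three states instead of four, but it can no longer tell a wrong reply from a correct one. That erroneous trace is undetectable by the observer, while an unsolicited reply is still caught.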
Index Terms:
fault detection, concurrent fault detection, distributed systems
Citation:
K.N. Oikonomou, R.Y. Kain, "Abstractions for Node Level Passive Fault Detection in Distributed Systems," IEEE Transactions on Computers, vol. 32, no. 6, pp. 543-550, June 1983, doi:10.1109/TC.1983.1676276