This paper addresses the handling of detected faults in computer systems. We present software procedures for fault detection, fault masking, and error recovery, and discuss them in the context of standard PC Windows and Linux environments. Various aspects of checkpointing and recovery policies are studied. The considerations are illustrated with experimental results obtained in our fault injection testbench.
Citation:
A. Lesiak, P. Gawkowski, J. Sosnowski, "Error Recovery Problems," 2nd International Conference on Dependability of Computer Systems (DepCoS-RELCOMEX '07), pp. 270-277, 2007