Issue No.02 - February (2002 vol.51)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/12.980004
Message-driven confidence-driven (MDCD) error containment and recovery, a low-cost approach to mitigating the effect of software design faults in distributed embedded systems, is developed for onboard guarded software upgrading for deep-space missions. In this paper, we first describe and verify the MDCD algorithms in which we introduce the notion of "confidence-driven" to complement the "communication-induced" approach employed by a number of existing checkpointing protocols to achieve error containment and recovery efficiency. We then conduct a model-based analysis to show that the algorithms ensure low performance overhead. Finally, we discuss the advantages of the MDCD approach and its potential utility as a general-purpose, low-cost software fault tolerance technique for distributed embedded computing.
Keywords: Guarded software upgrading, message-driven confidence-driven, global state consistency and recoverability, performance overhead, software fault tolerance, distributed embedded systems.
K.S. Tso, L. Alkalai, S.N. Chau, W.H. Sanders, "Low-Cost Error Containment and Recovery for Onboard Guarded Software Upgrading and Beyond", IEEE Transactions on Computers, vol. 51, no. 2, pp. 121-137, February 2002, doi:10.1109/12.980004