A Distributed Recovery Block Approach to Fault-Tolerant Execution of Application Tasks in Hypercubes
Issue No. 01 - January (1993 vol. 4)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/71.205657
<p>An approach to fault-tolerant execution of real-time application tasks in hypercubes is proposed. The approach is based on the distributed recovery block (DRB) scheme and does not require special hardware mechanisms in support of fault tolerance. Each task is assigned to a pair of processors forming a DRB computing station for execution in a dual-redundant and self-checking mode. Assignment of all tasks in an application in such a form is called the full DRB mapping. The DRB scheme was developed as an approach to uniform treatment of hardware and software faults with the effect of fast forward recovery. However, if the system developer is concerned with hardware fault possibilities only, then forming DRB stations becomes a mechanical process not burdening the application software designer in any way. A procedure for converting an efficient nonredundant task-to-processor mapping into an efficient full DRB mapping is presented.</p>
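The idea of a full DRB mapping can be illustrated with a minimal sketch. It is not the conversion procedure of the paper, which the abstract does not spell out. It assumes one simple (hypothetical) construction: a nonredundant mapping on a d-dimensional hypercube is lifted to a (d+1)-dimensional hypercube, so that each task's primary node keeps its original address and its shadow node is the neighbor obtained by setting the new high-order bit. The names `full_drb_mapping`, `hamming`, and `run_station` are illustrative only, and the station is simulated sequentially, whereas a real DRB station runs both nodes concurrently.

```python
from typing import Any, Callable, Dict, Optional, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing address bits, i.e. hop distance in a hypercube."""
    return bin(a ^ b).count("1")


def full_drb_mapping(mapping: Dict[str, int], dim: int) -> Dict[str, Tuple[int, int]]:
    """Lift a task -> node mapping on a dim-cube to task -> (primary, shadow).

    Assumption (not from the paper): the shadow of node p is p with bit `dim`
    set, so every station is a pair of adjacent nodes in a (dim+1)-cube.
    """
    stations = {}
    for task, node in mapping.items():
        if not 0 <= node < (1 << dim):
            raise ValueError(f"node {node} is not in a {dim}-cube")
        stations[task] = (node, node | (1 << dim))
    return stations


def run_station(
    primary_try: Callable[[Any], Any],
    shadow_try: Callable[[Any], Any],
    acceptance_test: Callable[[Any], bool],
    data: Any,
) -> Optional[Any]:
    """Sequential simulation of one DRB station.

    The primary's result is used if it passes the acceptance test; otherwise
    the shadow's already-computed result is taken (forward recovery).
    Returns None if neither result is acceptable.
    """
    for try_block in (primary_try, shadow_try):
        try:
            result = try_block(data)
        except Exception:
            continue  # a crashed node is treated like a failed acceptance test
        if acceptance_test(result):
            return result
    return None


# Example: two tasks mapped onto a 2-cube, lifted to stations in a 3-cube.
stations = full_drb_mapping({"t0": 0b00, "t1": 0b11}, dim=2)
```

When only hardware faults are of concern, `primary_try` and `shadow_try` would be the same routine, which matches the abstract's remark that forming stations then becomes a mechanical step.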
Index Terms: dual redundant mode; task assignment; distributed recovery block; fault-tolerant execution; real-time application tasks; hypercubes; computing station; self-checking mode; software faults; fast forward recovery; hardware fault; fault tolerant computing; hypercube networks
A. Kavianpour and K. Kim, "A Distributed Recovery Block Approach to Fault-Tolerant Execution of Application Tasks in Hypercubes," in IEEE Transactions on Parallel & Distributed Systems, vol. 4, no. 1, pp. 104-111, 1993.