Issue No. 02 - February (2003 vol. 52)
Hewijin C. Jiau, IEEE
W. Kent Fuchs, IEEE
Kuo-Feng Ssu, IEEE
Abstract: Heterogeneous computing environments, where computers may have different instruction set architectures, data representations, and operating systems, complicate checkpointing and recovery of processes. This paper describes an approach to recovery and an implementation, PREACHES, that provides portable checkpointing of single-process applications in heterogeneous systems using checkpoint propagation. The checkpoint propagation mechanism creates machine-dependent checkpoints for different architectures in the heterogeneous environment. A process is restored on a specific machine with the checkpoint that is appropriate for the architecture. An implementation of PREACHES has been evaluated on a heterogeneous network of workstations, including Sun, HP, and Pentium machines. The experimental results show that PREACHES achieves efficient checkpointing and rapid recovery.
Index Terms: Heterogeneous systems, portable checkpointing, rollback recovery, process migration.
Hewijin C. Jiau, W. Kent Fuchs, Kuo-Feng Ssu, "Process Recovery in Heterogeneous Systems", IEEE Transactions on Computers, vol. 52, no. 2, pp. 126-138, February 2003, doi:10.1109/TC.2003.1176981