Issue No. 04 - April (1997 vol. 46)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/12.588067
<p><b>Abstract</b>: In this paper, we consider a load-balancing process allocation method for fault-tolerant multicomputer systems that balances the load before as well as after faults start to degrade the performance of the system. To tolerate a single fault, each process (primary process) is duplicated (i.e., has a backup process). The backup process executes on a different processor from the primary, checkpointing the primary process and recovering the process if the primary process fails. We formalize the problem of load-balancing process allocation, propose a new process allocation method, and analyze its performance. Simulations are used to compare the proposed method with a process allocation method that does not take into account the different load characteristics of the primary and backup processes. While both methods perform well before the occurrence of a fault, only the proposed method maintains a balanced load after the occurrence of such a fault.</p>
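The effect described in the abstract can be illustrated with a small load model. The sketch below is not the allocation method proposed in the paper; it only assumes that a backup carries a light checkpointing load until its primary's processor fails, at which point it takes over the full primary load. All names (`loads_before_fault`, `loads_after_fault`) and the load values (10 per primary, 1 per backup) are hypothetical and chosen for illustration.

```python
def loads_before_fault(n_procs, primary_host, backup_host, primary_load, backup_load):
    """Per-processor load while every processor is healthy."""
    loads = [0] * n_procs
    for p, host in enumerate(primary_host):
        loads[host] += primary_load[p]
    for p, host in enumerate(backup_host):
        loads[host] += backup_load[p]
    return loads


def loads_after_fault(n_procs, primary_host, backup_host, primary_load, backup_load, failed):
    """Per-processor load after processor `failed` goes down (assumed model)."""
    loads = [0] * n_procs
    for p in range(len(primary_host)):
        if primary_host[p] == failed:
            # The backup takes over and now carries the full primary load.
            loads[backup_host[p]] += primary_load[p]
        elif backup_host[p] == failed:
            # The primary survives; its lost backup is not re-created here.
            loads[primary_host[p]] += primary_load[p]
        else:
            loads[primary_host[p]] += primary_load[p]
            loads[backup_host[p]] += backup_load[p]
    return loads


# Hypothetical system: 4 processors, 8 processes, 2 primaries per processor.
N = 4
primary_host = [0, 0, 1, 1, 2, 2, 3, 3]
primary_load = [10] * 8   # assumed load of an active (primary) process
backup_load = [1] * 8     # assumed checkpointing load of a backup

# Placement A: both backups of a processor's primaries sit on one neighbor.
clustered = [1, 1, 2, 2, 3, 3, 0, 0]
# Placement B: backups of a processor's primaries sit on different processors.
spread = [1, 2, 2, 3, 3, 0, 0, 1]

before_a = loads_before_fault(N, primary_host, clustered, primary_load, backup_load)
before_b = loads_before_fault(N, primary_host, spread, primary_load, backup_load)
after_a = loads_after_fault(N, primary_host, clustered, primary_load, backup_load, failed=0)
after_b = loads_after_fault(N, primary_host, spread, primary_load, backup_load, failed=0)

print(before_a, before_b)  # identical, balanced loads before the fault
print(after_a, after_b)    # the peak load after the fault differs
```

Under these assumptions both placements look equally balanced before the fault, but the clustered placement concentrates all recovered work on a single processor afterwards, while the spread placement lowers the peak load.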
Backup process, checkpointing, fault-tolerant multicomputer, load balancing, process allocation.
S. Lee, H. Lee and J. Kim, "Replicated Process Allocation for Load Distribution in Fault-Tolerant Multicomputers," in IEEE Transactions on Computers, vol. 46, no. 4, pp. 499-505, 1997.