Cluster Computing and the Grid, IEEE International Symposium on (2006)
May 16, 2006 to May 19, 2006
Cong Du, Illinois Institute of Technology, USA
Xian-He Sun, Illinois Institute of Technology, USA
Group communications are commonly used in parallel and distributed environments. However, existing migration mechanisms do not support group communications. This weakness prevents techniques such as migration-based proactive fault tolerance from being applied to MPI applications. In this study, we propose distributed migration protocols with group membership management to support process migration in the presence of group changes. We design and implement a process-migration-enabling MPI library, named MPI-Mitten, to verify the protocols and to enhance current MPI platforms for reliability and usability. MPI-Mitten is based on the MPI standard and can be applied to any MPI-2 implementation. Experimental results show that the proposed distributed process migration protocols are solid and that the MPI-Mitten system is effective and uniquely supports migration-based fault tolerance.
C. Du and X.-H. Sun, "MPI-Mitten: Enabling Migration Technology in MPI," Cluster Computing and the Grid, IEEE International Symposium on (CCGRID), Singapore, 2006, pp. 11-18.