Parallel Computing in Electrical Engineering, 2004. International Conference on (2004)
Dresden, Germany
Sept. 7, 2004 to Sept. 10, 2004
ISBN: 0-7695-2080-4
pp: 390-393
Pawel Czarnul, Gdansk University of Technology, Poland
Arkadiusz Urbaniak, Gdansk University of Technology, Poland
Marcin Fraczak, Gdansk University of Technology, Poland
Maciej Dyczkowski, Wroclaw University of Technology
Bartlomiej Balcerek, Wroclaw University of Technology
ABSTRACT
Many kernel-level and user-level libraries and systems support checkpointing running processes and resuming their operation, yet it remains very difficult to provide an easy-to-use tool for checkpointing parallel applications. In this work, we aim to develop an easy-to-use, user-guided library that supports checkpointing of parallel MPI applications executed within the CLUSTERIX environment, i.e., a collection of distributed HPC clusters. We propose a programmer-assisted approach in which process state is packed and unpacked at the code level for SPMD HPC applications. Although the library is at an early stage of development, we present checkpoint/restart times and the execution times of an application interrupted by checkpointing, and compare the proposed approach with the same application linked against the ckpt user-level library.
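The programmer-assisted idea named in the abstract, packing process state at the code level and unpacking it on restart, can be illustrated with a minimal sketch. This is not the library's API, which the abstract does not describe: every name here (`pack_state`, `unpack_state`, `save_checkpoint`, `load_checkpoint`, `run`), the choice of saved variables, and the file layout are hypothetical. MPI is left out so the example stays self-contained; the rank is passed as a plain integer, and each simulated SPMD process writes its own checkpoint file.

```python
import os
import struct
import tempfile

# Hypothetical state layout chosen by the programmer: iteration counter (int64),
# partial sum (float64), number of values (int64), followed by the float64 values.
_HEADER = struct.Struct("<qdq")


def pack_state(iteration, partial_sum, values):
    """Serialize only the variables the programmer decided to checkpoint."""
    body = struct.pack(f"<{len(values)}d", *values)
    return _HEADER.pack(iteration, partial_sum, len(values)) + body


def unpack_state(blob):
    """Inverse of pack_state: rebuild the saved variables from bytes."""
    iteration, partial_sum, count = _HEADER.unpack_from(blob, 0)
    values = list(struct.unpack_from(f"<{count}d", blob, _HEADER.size))
    return iteration, partial_sum, values


def checkpoint_path(directory, rank):
    # One file per rank, as each SPMD process saves its own state.
    return os.path.join(directory, f"ckpt_rank{rank}.bin")


def save_checkpoint(directory, rank, blob):
    """Write to a temporary file, then rename, so a crash mid-write
    never leaves a truncated checkpoint under the final name."""
    final = checkpoint_path(directory, rank)
    tmp = final + ".tmp"
    with open(tmp, "wb") as f:
        f.write(blob)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, final)


def load_checkpoint(directory, rank):
    """Return the saved bytes for this rank, or None if no checkpoint exists."""
    path = checkpoint_path(directory, rank)
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return f.read()


def run(directory, rank, total_iterations, stop_after=None):
    """Toy SPMD loop body. Resumes from a checkpoint when one exists.
    stop_after simulates an interruption and makes the call return None."""
    blob = load_checkpoint(directory, rank)
    if blob is None:
        iteration, partial_sum, values = 0, 0.0, []
    else:
        iteration, partial_sum, values = unpack_state(blob)

    while iteration < total_iterations:
        if stop_after is not None and iteration >= stop_after:
            return None  # simulated failure before completion
        x = float(rank + iteration)  # stand-in for real per-rank work
        partial_sum += x
        values.append(x)
        iteration += 1
        save_checkpoint(directory, rank, pack_state(iteration, partial_sum, values))
    return partial_sum, values


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    run(workdir, rank=1, total_iterations=5, stop_after=3)  # interrupted run
    print(run(workdir, rank=1, total_iterations=5))         # resumed run
```

In this sketch the programmer, not the runtime, decides which variables form the state. That keeps checkpoints small compared with saving a full process image, at the cost of requiring changes to the application code.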
INDEX TERMS
Process Checkpointing, Checkpointing Parallel Applications, Parallel Software Environments
CITATION
Pawel Czarnul, Arkadiusz Urbaniak, Marcin Fraczak, Maciej Dyczkowski, Bartlomiej Balcerek, "Towards Easy-to-Use Checkpointing of MPI Applications within CLUSTERIX", Parallel Computing in Electrical Engineering, 2004. International Conference on, pp. 390-393, 2004, doi:10.1109/PCEE.2004.72