A Two-Level Checkpoint Algorithm in a Highly-Available Parallel Single Level Store System
First IEEE International Symposium on Cluster Computing and the Grid (CCGrid'01)
Brisbane, Australia, May 15-18, 2001
ISBN: 0-7695-1010-8
A Parallel Single Level Store (PSLS) system integrates a shared virtual memory and a parallel file system. By managing data globally, a PSLS offers programmers of scientific applications the attractive shared-memory programming model combined with a large and efficient file system in a cluster. In this paper, we present a cheap and efficient two-level checkpointing approach that enables a PSLS to tolerate failures. The first-level checkpointing algorithm is very efficient and saves data in memory, but it requires a large amount of memory space. When memory is saturated, an alternative algorithm that saves a checkpoint on disk is used instead. Performance results show the impact of different variants of the checkpointing algorithms.
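The fallback policy described in the abstract (checkpoint in memory while space remains, switch to disk once memory is saturated) can be illustrated with a small single-process sketch. This is not the paper's algorithm: the class `TwoLevelCheckpointer`, the `memory_budget_bytes` cap, the page identifiers, and the file naming are all hypothetical, and the sketch ignores the distribution, replication, and consistency concerns a real PSLS would have to handle.

```python
import os
import pickle
import tempfile


class TwoLevelCheckpointer:
    """Toy two-level checkpoint store: memory first, disk as a fallback.

    All names and the fixed byte budget are illustrative assumptions.
    """

    def __init__(self, memory_budget_bytes, disk_dir):
        self.memory_budget_bytes = memory_budget_bytes  # hypothetical memory cap
        self.disk_dir = disk_dir
        self.in_memory = {}   # page_id -> serialized checkpoint bytes
        self.on_disk = {}     # page_id -> path of checkpoint file
        self.memory_used = 0

    def save(self, page_id, data):
        """Checkpoint `data`; return which level ("memory" or "disk") holds it."""
        blob = pickle.dumps(data)
        self._drop(page_id)  # discard any older checkpoint of this page
        if self.memory_used + len(blob) <= self.memory_budget_bytes:
            # First level: keep the checkpoint in memory.
            self.in_memory[page_id] = blob
            self.memory_used += len(blob)
            return "memory"
        # Second level: memory is saturated, so write the checkpoint to disk.
        path = os.path.join(self.disk_dir, f"page_{page_id}.ckpt")
        with open(path, "wb") as f:
            f.write(blob)
        self.on_disk[page_id] = path
        return "disk"

    def restore(self, page_id):
        """Return the last checkpointed value of a page, from either level."""
        if page_id in self.in_memory:
            return pickle.loads(self.in_memory[page_id])
        with open(self.on_disk[page_id], "rb") as f:
            return pickle.loads(f.read())

    def _drop(self, page_id):
        old = self.in_memory.pop(page_id, None)
        if old is not None:
            self.memory_used -= len(old)
        path = self.on_disk.pop(page_id, None)
        if path is not None and os.path.exists(path):
            os.remove(path)


if __name__ == "__main__":
    payload = list(range(100))
    budget = len(pickle.dumps(payload))  # room for exactly one checkpoint
    with tempfile.TemporaryDirectory() as d:
        ck = TwoLevelCheckpointer(budget, d)
        print(ck.save(0, payload))  # fits within the memory budget
        print(ck.save(1, payload))  # budget exhausted, goes to disk
```

The sketch makes the trade-off in the abstract concrete: the in-memory level avoids file I/O but consumes the budget quickly, while the disk level is only used once that budget is exhausted.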
Citation:
Christine Morin, Renaud Lottiaux, Anne-Marie Kermarrec, "A Two-Level Checkpoint Algorithm in a Highly-Available Parallel Single Level Store System," ccgrid, pp.514, First IEEE International Symposium on Cluster Computing and the Grid (CCGrid'01), 2001