Issue No.10 - October (1998 vol.9)
pp: 972-986
ABSTRACT
Diskless Checkpointing is a technique for checkpointing the state of a long-running computation on a distributed system without relying on stable storage. As such, it eliminates the performance bottleneck of traditional checkpointing on distributed systems. In this paper, we motivate diskless checkpointing and present the basic diskless checkpointing scheme along with several variants for improved performance. The performance of the basic scheme and its variants is evaluated on a high-performance network of workstations and compared to traditional disk-based checkpointing. We conclude that diskless checkpointing is a desirable alternative to disk-based checkpointing that can improve the performance of distributed applications in the face of failures.
INDEX TERMS
Fault tolerance, checkpointing, rollback recovery, memory redundancy, error-correcting codes, copy-on-write, RAID systems.
CITATION
James S. Plank, Kai Li, Michael A. Puening, "Diskless Checkpointing", IEEE Transactions on Parallel & Distributed Systems, vol. 9, no. 10, pp. 972-986, October 1998, doi:10.1109/71.730527