Issue No.02 - April-June (2006 vol.3)
Tadashi Dohi, IEEE Computer Society
Hiroyuki Okamura, IEEE Computer Society
Naoto Kaio, IEEE
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TDSC.2006.22
In this paper, we consider two kinds of sequential checkpoint placement problems, one with an infinite time horizon and one with a finite time horizon. For these problems, we apply approximation methods based on the variational principle and develop computation algorithms that derive the optimal checkpoint sequence approximately. Next, we focus on the situation where knowledge of system failure is incomplete, i.e., the system failure time distribution is unknown. We develop so-called min-max checkpoint placement methods to determine the optimal checkpoint sequence when the system failure time distribution is uncertain. In numerical examples, we quantitatively investigate the proposed distribution-free checkpoint placement methods and discuss their potential applicability in practice.
Checkpoint/restart, fault-tolerance, high availability, modeling and prediction, performance evaluation, maintenance, incomplete failure information.
Tatsuya Ozaki, Tadashi Dohi, Hiroyuki Okamura, Naoto Kaio, "Distribution-Free Checkpoint Placement Algorithms Based on Min-Max Principle", IEEE Transactions on Dependable and Secure Computing, vol. 3, no. 2, pp. 130-140, April-June 2006, doi:10.1109/TDSC.2006.22