On the Optimal Checkpointing of Critical Tasks and Transaction-Oriented Systems
January 1992 (vol. 18 no. 1)
pp. 72-77

The probability distribution of the overhead caused by the checkpointing rollback recovery technique is evaluated for two cases: a single critical task and an overall transaction-oriented system. The distribution is obtained in Laplace-Stieltjes transform form, from which all moments can be easily calculated; alternatively, inversion methods can be used to evaluate the distribution itself. Based on this distribution, the authors propose checkpointing strategies that optimize performance criteria motivated, for critical tasks, by real-time constraints and, for transaction-oriented systems, by the need to give users guarantees on the maximum system unavailability.
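
As an illustration of how moments follow from a Laplace-Stieltjes transform (LST), recall that for a nonnegative random variable X with LST F*(s) = E[exp(-sX)], E[X] = -F*'(0) and E[X^2] = F*''(0). The sketch below is not the paper's overhead distribution: it assumes, purely as a stand-in, an exponential distribution with rate 2, and the function names lst_exponential and first_two_moments are hypothetical. The derivatives are approximated by central finite differences, which works here because this particular LST is also defined slightly to the left of s = 0.

```python
def lst_exponential(s: float, rate: float = 2.0) -> float:
    """LST of an exponential distribution (assumed stand-in): rate / (rate + s)."""
    return rate / (rate + s)


def first_two_moments(lst, h: float = 1e-3):
    """Estimate E[X] and E[X^2] from an LST by central differences at s = 0.

    Assumes lst(s) can be evaluated at s = -h, which holds for the
    exponential example but not for every distribution.
    """
    # E[X] = -F*'(0)
    mean = -(lst(h) - lst(-h)) / (2.0 * h)
    # E[X^2] = F*''(0)
    second = (lst(h) - 2.0 * lst(0.0) + lst(-h)) / (h * h)
    return mean, second


mean, second = first_two_moments(lst_exponential)
variance = second - mean ** 2
print(mean, second, variance)
```

For the assumed rate of 2, the exact values are E[X] = 0.5, E[X^2] = 0.5, and Var[X] = 0.25, so the printed estimates should be close to those figures. Higher moments would need higher-order differences or symbolic differentiation, and recovering the full distribution would need a numerical inversion method instead.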

Index Terms:
optimal checkpointing; transaction-oriented systems; probability distribution; overhead; checkpointing rollback recovery technique; single critical task; Laplace-Stieltjes transform form; moments; inversion methods; checkpointing strategies; performance criteria; real time constraints; maximum system unavailability; database management systems; Laplace transforms; optimisation; real-time systems; transaction processing
V. Grassi, L. Donatiello, S. Tucci, "On the Optimal Checkpointing of Critical Tasks and Transaction-Oriented Systems," IEEE Transactions on Software Engineering, vol. 18, no. 1, pp. 72-77, Jan. 1992, doi:10.1109/32.120317