Comparative Analysis of Different Models of Checkpointing and Recovery
August 1990 (vol. 16 no. 8)
pp. 807-821

Different checkpointing strategies are combined with recovery models of different refinement levels in database systems. The complexity of the resulting model increases with its accuracy in representing a realistic system. Three solution approaches are used, depending on the complexity of the model: analytic, numerical, and simulation. A Markovian queuing model is developed, resulting in a combined Poisson and load-dependent checkpointing strategy with stochastic recovery. A state-space analysis approach is used to derive semianalytic expressions for the performance variables in terms of a set of unknown boundary state probabilities. An efficient numerical algorithm for evaluating these unknown probabilities is outlined. The numerical solution is checked against simulation results and shown to be of acceptable accuracy, particularly in the stable operating range. Simulations show that realistic load-dependent checkpointing yields performance close to that of optimal deterministic checkpointing. Furthermore, the stochastic recovery model accurately represents realistic recovery.

Index Terms:
DBMS; checkpointing strategies; recovery models; refinement levels; realistic system; analytic approaches; simulation; Markovian queuing model; Poisson; load-dependent checkpointing strategy; stochastic recovery; state-space analysis approach; semianalytic expressions; performance variables; unknown boundary state probabilities; numerical algorithm; numerical solution; stable operating range; optimal deterministic checkpointing; computational complexity; database management systems; database theory; probability; queueing theory; system recovery.
V.F. Nicola, J.M. van Spanje, "Comparative Analysis of Different Models of Checkpointing and Recovery," IEEE Transactions on Software Engineering, vol. 16, no. 8, pp. 807-821, Aug. 1990, doi:10.1109/32.57620