Checkpointing for Distributed Databases: Starting from the Basics
September 1992 (vol. 3 no. 5)
pp. 602-610
Checkpointing in a distributed database system is analyzed by establishing a correspondence between consistent snapshots in a general distributed system and transaction-consistent checkpoints in a distributed database system. The analysis culminates in a useful condition for transaction-consistent checkpoints. Based on this condition, a general checkpointing scheme is presented that records a transaction-consistent set of values of all or some selected data items. The rules of this scheme are implemented in some representative concurrency control protocols, namely those based on two-phase locking and timestamping. These implementations cause little interference with other activities in the database system.

Index Terms:
distributed databases; consistent snapshots; transaction-consistent checkpoints; checkpointing; concurrency control protocols; two-phase locking; timestamping; concurrency control; database theory
S. Pilarski, T. Kameda, "Checkpointing for Distributed Databases: Starting from the Basics," IEEE Transactions on Parallel and Distributed Systems, vol. 3, no. 5, pp. 602-610, Sept. 1992, doi:10.1109/71.159043