Efficient Algorithms for Global Snapshots in Large Distributed Systems
May 2010 (vol. 21 no. 5)
pp. 620-630
Rahul Garg, IBM T.J. Watson Research Center, Yorktown Heights
Vijay K. Garg, University of Texas at Austin, Austin
Yogish Sabharwal, IBM India Research Laboratory, New Delhi
Existing algorithms for global snapshots in distributed systems do not scale when the underlying topology is complete. These algorithms fall primarily into two classes. Algorithms in the first class use control messages of size O(1) but require O(N) space and O(N) messages per processor in a network with N processors. Algorithms in the second class use control messages of size O(N) (such as rotating tokens with the vector counter method), use multiple control messages per channel, or require recording of message history. As a result, algorithms in both classes are inefficient in large systems when the logical topology of the communication layer, such as MPI, is complete. In this paper, we propose three scalable algorithms for global snapshots: a grid-based, a tree-based, and a centralized algorithm. The grid-based algorithm uses O(N) space but only O(\sqrt{N}) messages per processor, each of size O(\sqrt{N}). The tree-based and centralized algorithms use only O(1) size messages. The tree-based algorithm requires O(1) space and O(\log N \log (W/N)) messages per processor, where W is the total number of messages in transit. The centralized algorithm requires O(1) space and O(\log (W/N)) messages per processor. We also give a matching lower bound for this problem, and we present a hybrid of the centralized and tree-based algorithms that allows a trade-off between decentralization and message complexity. Our algorithms have applications in checkpointing, detecting stable predicates, and implementing synchronizers.
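To give a rough feel for the bounds quoted in the abstract, the following sketch tabulates the leading-order per-processor message counts and message sizes for each class of algorithm. It is only an illustration: the function name `per_processor_costs` is hypothetical, all hidden constants are assumed to be 1, logarithms are assumed to be base 2, and W is assumed to be larger than N. It does not implement any of the algorithms.

```python
import math


def per_processor_costs(n, w):
    """Leading-order costs per processor, with all constants assumed to be 1.

    n: number of processors (N in the abstract)
    w: total number of messages in transit (W in the abstract), assumed > n
    """
    log_ratio = math.log2(w / n)  # the log(W/N) factor; base 2 is an assumption
    return {
        # First class of existing algorithms: O(N) messages of size O(1).
        "first_class": {"messages": n, "message_size": 1},
        # Grid-based: O(sqrt(N)) messages, each of size O(sqrt(N)).
        "grid": {"messages": math.sqrt(n), "message_size": math.sqrt(n)},
        # Tree-based: O(log N * log(W/N)) messages of size O(1).
        "tree": {"messages": math.log2(n) * log_ratio, "message_size": 1},
        # Centralized: O(log(W/N)) messages of size O(1).
        "centralized": {"messages": log_ratio, "message_size": 1},
    }


if __name__ == "__main__":
    for name, cost in per_processor_costs(1024, 16 * 1024).items():
        print(f"{name:12s} messages={cost['messages']:8.1f} "
              f"size={cost['message_size']:6.1f}")
```

Under these assumptions, N = 1024 and W = 16384 give 1024 messages per processor for the first class, 32 for the grid-based algorithm, 40 for the tree-based algorithm, and 4 for the centralized algorithm. Real constants and the base of the logarithm would shift these figures, so they indicate growth rates rather than measured costs.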


Index Terms:
Checkpointing, global snapshots, stable predicates.
Citation:
Rahul Garg, Vijay K. Garg, Yogish Sabharwal, "Efficient Algorithms for Global Snapshots in Large Distributed Systems," IEEE Transactions on Parallel and Distributed Systems, vol. 21, no. 5, pp. 620-630, May 2010, doi:10.1109/TPDS.2009.108