Fault-Tolerant Clock Synchronization in Large Multicomputer Systems
September 1994 (vol. 5 no. 9)
pp. 912-923

The cost of synchronizing a multicomputer increases with system size. For large multicomputers, the time and resources spent to enable each node to estimate the clock value of every other node in the system can be prohibitive. We show how to reduce the cost of synchronization by assigning each node to one or more groups, then having each node estimate the clock values of only those nodes with which it shares a group. Since each node estimates the clock value of only a subset of the nodes, the cost of synchronization can be significantly reduced. We also provide a method for computing the maximum skew between any two nodes in the multicomputer, and a method for computing the maximum time between synchronizations. We also show how the fault tolerance of the synchronization algorithm may be determined.
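
The grouping idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the 4x4 row-and-column layout, the function names (`peers`, `estimate_count`, `max_group_hops`), and the hop-count reasoning are all assumptions made for illustration. The sketch counts how many clock estimates are needed per round when each node tracks only the nodes it shares a group with, and finds the longest chain of group-sharing hops between any two nodes.

```python
from collections import deque


def peers(groups):
    """Map each node to the set of nodes it shares at least one group with."""
    result = {}
    for members in groups:
        for node in members:
            result.setdefault(node, set()).update(m for m in members if m != node)
    return result


def estimate_count(groups):
    """Total number of (node, peer) clock estimates needed per round."""
    return sum(len(p) for p in peers(groups).values())


def max_group_hops(groups):
    """Largest number of group-sharing hops between any two nodes (BFS diameter)."""
    adj = peers(groups)
    worst = 0
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) != len(adj):
            raise ValueError("groups do not connect all nodes")
        worst = max(worst, max(dist.values()))
    return worst


# Hypothetical layout (an assumption, not from the paper):
# 16 nodes on a 4x4 grid, one group per row and one per column.
rows = [[r * 4 + c for c in range(4)] for r in range(4)]
cols = [[r * 4 + c for r in range(4)] for c in range(4)]
groups = rows + cols

n = 16
print("all-pairs estimates:", n * (n - 1))           # every node tracks every other
print("grouped estimates:  ", estimate_count(groups))
print("max group hops:     ", max_group_hops(groups))
```

In this layout each node tracks 6 peers instead of 15. If the skew between any two nodes sharing a group is at most some bound, the triangle inequality gives a loose bound of that value times the hop count for any pair of nodes; the paper's own skew computation may well be tighter.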

Index Terms:
multiprocessing systems; synchronisation; clocks; fault tolerant computing; reliability; fault-tolerant clock synchronization; large multicomputer systems; clock value; maximum skew; maximum time; fault tolerance; synchronization algorithm; clock drift; clock skew
A. Olson, K.G. Shin, "Fault-Tolerant Clock Synchronization in Large Multicomputer Systems," IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 9, pp. 912-923, Sept. 1994, doi:10.1109/71.308530