Issue No. 09 - September (1994 vol. 5)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/71.308530
<p>The cost of synchronizing a multicomputer increases with system size. For large multicomputers, the time and resources spent to enable each node to estimate the clock value of every other node in the system can be prohibitive. We show how to reduce the cost of synchronization by assigning each node to one or more groups, then having each node estimate the clock values of only those nodes with which it shares a group. Since each node estimates the clock value of only a subset of the nodes, the cost of synchronization can be significantly reduced. We also provide a method for computing the maximum skew between any two nodes in the multicomputer and a method for computing the maximum time between synchronizations. Finally, we show how the fault tolerance of the synchronization algorithm may be determined.</p>
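The grouping idea in the abstract can be illustrated with a minimal sketch. It only counts how many clock estimates each node must maintain when it tracks just the nodes it shares a group with. The function name `peers_to_estimate` and the six-node, three-group layout are hypothetical choices made for illustration; they are not taken from the paper, and the sketch does not implement the paper's synchronization algorithm, skew bound, or fault-tolerance analysis.

```python
def peers_to_estimate(groups):
    """Map each node to the set of other nodes it shares at least one group with.

    `groups` maps a group name to the list of node ids in that group.
    """
    peers = {}
    for members in groups.values():
        for node in members:
            # A node estimates the clocks of its fellow group members only.
            peers.setdefault(node, set()).update(m for m in members if m != node)
    return peers


# Hypothetical layout (an assumption): 6 nodes in 3 overlapping groups.
groups = {
    "g0": [0, 1, 2],
    "g1": [2, 3, 4],
    "g2": [4, 5, 0],
}

peers = peers_to_estimate(groups)
n = len(peers)

# Total clock estimates maintained across the system with grouping.
total_grouped = sum(len(p) for p in peers.values())
# Total if every node estimated every other node's clock.
total_full = n * (n - 1)

print(total_grouped, total_full)
```

In this toy layout, nodes that belong to two groups track four peers and the others track two, so the total is lower than in the all-pairs case. The overlap between groups is what lets clock information propagate between nodes that share no group directly.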
Index Terms: multiprocessing systems; synchronisation; clocks; fault tolerant computing; reliability; fault-tolerant clock synchronization; large multicomputer systems; clock value; maximum skew; maximum time; fault tolerance; synchronization algorithm; clock drift; clock skew
A. Olson and K. Shin, "Fault-Tolerant Clock Synchronization in Large Multicomputer Systems," in IEEE Transactions on Parallel & Distributed Systems, vol. 5, no. 9, pp. 912-923, 1994.