Localizing Failures in Distributed Synchronization
July 1996 (vol. 7 no. 7)
pp. 705-716

Abstract—The fault-tolerance of distributed algorithms is investigated in asynchronous message passing systems with undetectable process failures. Two specific synchronization problems are considered, the dining philosophers problem and the binary committee coordination problem. The abstraction of a bounded doorway is introduced as a general mechanism for achieving individual progress and good failure locality. Using it as a building block, optimal fault-tolerant algorithms are constructed for the two problems.

Index Terms:
Concurrency, distributed algorithms, fault-tolerance, lower bounds, synchronization.
Citation:
Manhoi Choy, Ambuj K. Singh, "Localizing Failures in Distributed Synchronization," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 7, pp. 705-716, July 1996, doi:10.1109/71.508250