Fast Asynchronous Uniform Consensus in Real-Time Distributed Systems
August 2002 (vol. 51 no. 8)
pp. 931-944

We investigate whether asynchronous computational models and asynchronous algorithms can be considered for designing real-time distributed fault-tolerant systems. A priori, the lack of bounded finite delays is antagonistic with timeliness requirements. We show how to circumvent this apparent contradiction, via the principle of "late binding" of a solution to some (partially) synchronous model. This principle is shown to maximize the coverage of demonstrated safety, liveness, and timeliness properties. These general results are illustrated with the Uniform Consensus (UC) and the Real-Time UC problems, assuming processor crashes and reliable communications, considering asynchronous solutions based upon Unreliable Failure Detectors. We introduce the concept of Fast Failure Detectors and we show that the problem of building Strong or Perfect Fast Failure Detectors in real systems can be stated as a distributed message scheduling problem. A generic solution to this problem is given, illustrated considering deterministic Ethernets. In passing, it is shown that, with our construction of Unreliable Failure Detectors, asynchronous algorithms that solve UC have a worst-case termination lower bound that matches the optimal synchronous lower bound, that is, (t+1)D, where t is the maximum number of processors that may crash and D is the maximum interprocess message delay. Finally, we introduce FastUC, a novel solution to UC, that is based upon Fast Failure Detectors. FastUC has a worst-case termination time that is sublinear in tD. For most practical cases and common values of t, FastUC terminates in D, making it a worst-case time optimal solution to Real-Time UC.
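As a rough illustration of the (t+1)D bound quoted in the abstract, the following Python sketch evaluates it for sample values. The function name `sync_worst_case_bound` and the sample values (t = 2 crashes, D = 5 ms) are assumptions chosen for illustration and do not come from the paper. No formula is given here for FastUC, because the abstract states only that its worst case is sublinear in tD and equals D in most practical cases.

```python
def sync_worst_case_bound(t: int, d: float) -> float:
    """Return (t + 1) * d, the worst-case termination bound from the abstract.

    t: maximum number of processors that may crash (hypothetical input).
    d: maximum interprocess message delay, in any consistent time unit.
    """
    if t < 0 or d < 0:
        raise ValueError("t and d must be non-negative")
    return (t + 1) * d


if __name__ == "__main__":
    # Assumed sample values: up to 2 crashes, 5 ms maximum message delay.
    print(sync_worst_case_bound(2, 5.0))  # prints 15.0
```

With these assumed values, the bound is 15 ms, compared with the 5 ms (a single D) that the abstract reports for FastUC in most practical cases.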


Index Terms:
Asynchronous computational models, partially synchronous computational models, coverage, uniform consensus, real-time distributed fault-tolerant computing, safety, liveness, timeliness, unreliable failure detectors, schedulability analysis.
Citation:
Jean-François Hermant, Gérard Le Lann, "Fast Asynchronous Uniform Consensus in Real-Time Distributed Systems," IEEE Transactions on Computers, vol. 51, no. 8, pp. 931-944, Aug. 2002, doi:10.1109/TC.2002.1024740