Issue No. 04 - October-December (2009 vol. 6)
pp: 269-281
Martin Hutle, EPFL Vienna University of Technology, Vienna
Dahlia Malkhi, Microsoft Research
Ulrich Schmid, Vienna University of Technology, Vienna
Lidong Zhou, Microsoft Research
ABSTRACT
Aguilera et al. and Malkhi et al. presented two system models that are weaker than all previously proposed models in which the eventual leader election oracle Ω can be implemented and, thus, consensus can be solved. The former model assumes unicast steps and at least one correct process with f outgoing eventually timely links, whereas the latter assumes broadcast steps and at least one correct process with f bidirectional but moving eventually timely links. Consequently, the two models are incomparable. In this paper, we show that Ω can also be implemented in a system with at least one process with f outgoing moving eventually timely links, assuming either unicast or broadcast steps. This appears to be the weakest system model known so far that allows solving consensus via Ω-based algorithms. We also provide matching lower bounds for the communication complexity of Ω in this model, which are based on an interesting “stabilization property” of infinite runs. These results reveal that a fairly high price must be paid for this further relaxation of synchrony properties.
INDEX TERMS
Distributed systems, failure detectors, fault-tolerant distributed consensus, system modeling, partial synchrony.
CITATION
Martin Hutle, Dahlia Malkhi, Ulrich Schmid, Lidong Zhou, "Chasing the Weakest System Model for Implementing Ω and Consensus", IEEE Transactions on Dependable and Secure Computing, vol. 6, no. 4, pp. 269-281, October-December 2009, doi:10.1109/TDSC.2008.24
REFERENCES
[1] M.K. Aguilera, C. Delporte-Gallet, H. Fauconnier, and S. Toueg, “On Implementing Omega with Weak Reliability and Synchrony Assumptions,” Proc. 22nd Ann. ACM Symp. Principles of Distributed Computing (PODC '03), pp. 306-314, 2003.
[2] M.K. Aguilera, W. Chen, and S. Toueg, “Heartbeat: A Timeout-Free Failure Detector for Quiescent Reliable Communication,” Proc. 11th Int'l Workshop Distributed Algorithms (WDAG '97), pp. 126-140, 1997.
[3] M.K. Aguilera, W. Chen, and S. Toueg, “Using the Heartbeat Failure Detector for Quiescent Reliable Communication and Consensus in Partitionable Networks,” Theoretical Computer Science, vol. 220, no. 1, pp. 3-30, June 1999.
[4] M.K. Aguilera, C. Delporte-Gallet, H. Fauconnier, and S. Toueg, “Stable Leader Election,” Proc. 15th Int'l Conf. Distributed Computing (DISC '01), pp. 108-122, 2001.
[5] M.K. Aguilera, C. Delporte-Gallet, H. Fauconnier, and S. Toueg, “Communication-Efficient Leader Election and Consensus with Limited Link Synchrony,” Proc. 23rd Ann. ACM Symp. Principles of Distributed Computing (PODC '04), pp. 328-337, 2004.
[6] E. Anceaume, A. Fernández, A. Mostéfaoui, G. Neiger, and M. Raynal, “A Necessary and Sufficient Condition for Transforming Limited Accuracy Failure Detectors,” J. Computer and System Sciences, vol. 68, no. 1, pp. 123-133, 2004.
[7] H. Attiya, C. Dwork, N. Lynch, and L. Stockmeyer, “Bounds on the Time to Reach Agreement in the Presence of Timing Uncertainty,” J. ACM, vol. 41, no. 1, pp. 122-152, 1994.
[8] A. Basu, B. Charron-Bost, and S. Toueg, “Crash Failures versus Crash + Link Failures,” Proc. 15th Ann. ACM Symp. Principles of Distributed Computing (PODC '96), p. 246, 1996.
[9] T.D. Chandra, V. Hadzilacos, and S. Toueg, “The Weakest Failure Detector for Solving Consensus,” J. ACM, vol. 43, no. 4, pp. 685-722, June 1996.
[10] T.D. Chandra and S. Toueg, “Unreliable Failure Detectors for Reliable Distributed Systems,” J. ACM, vol. 43, no. 2, pp. 225-267, Mar. 1996.
[11] F.C. Chu, “Reducing Ω to ◇W,” Information Processing Letters, vol. 67, no. 6, pp. 293-298, 1998.
[12] R. Diestel, Graph Theory, third ed. Springer, 2006.
[13] D. Dolev, C. Dwork, and L. Stockmeyer, “On the Minimal Synchronism Needed for Distributed Consensus,” J. ACM, vol. 34, no. 1, pp. 77-97, Jan. 1987.
[14] C. Dwork, N. Lynch, and L. Stockmeyer, “Consensus in the Presence of Partial Synchrony,” J. ACM, vol. 35, no. 2, pp. 288-323, Apr. 1988.
[15] C. Fetzer, U. Schmid, and M. Süßkraut, “On the Possibility of Consensus in Asynchronous Systems with Finite Average Response Times,” Proc. 25th Int'l Conf. Distributed Computing Systems (ICDCS '05), pp. 271-280, June 2005.
[16] M.J. Fischer, N.A. Lynch, and M.S. Paterson, “Impossibility of Distributed Consensus with One Faulty Process,” J. ACM, vol. 32, no. 2, pp. 374-382, Apr. 1985.
[17] R. Guerraoui and A. Schiper, ““Γ-Accurate” Failure Detectors,” Proc. 10th Int'l Workshop Distributed Algorithms (WDAG '96), Ö. Babaoğlu, ed., vol. 1151, pp. 269-286, Oct. 1996.
[18] M. Larrea, A. Fernández, and S. Arévalo, “Efficient Algorithms to Implement Unreliable Failure Detectors in Partially Synchronous Systems,” Proc. 13th Int'l Symp. Distributed Computing (DISC '99), pp. 34-48, Sept. 1999.
[19] G. Le Lann and U. Schmid, “How to Implement a Timer-Free Perfect Failure Detector in Partially Synchronous Systems,” Technical Report 183/1-127, Dept. Automation, Technische Univ. Wien, Jan. 2003.
[20] D. Malkhi, F. Oprea, and L. Zhou, “Ω Meets Paxos: Leader Election and Stability without Eventual Timely Links,” Proc. 19th Symp. Distributed Computing (DISC '05), vol. 3724, pp. 199-213, 2005.
[21] A. Mostéfaoui and M. Raynal, “Solving Consensus Using Chandra-Toueg's Unreliable Failure Detectors: A General Quorum-Based Approach,” Proc. 13th Int'l Symp. Distributed Computing (DISC '99), P. Jayanti, ed., vol. 1693, pp. 49-63, Sept. 1999.
[22] A. Mostéfaoui and M. Raynal, “Unreliable Failure Detectors with Limited Scope Accuracy and an Application to Consensus,” Proc. 19th Conf. Foundations of Software Technology and Theoretical Computer Science (FSTTCS '99), pp. 329-340, 1999.
[23] A. Mostéfaoui and M. Raynal, “k-Set Agreement with Limited Accuracy Failure Detectors,” Proc. 19th Ann. ACM Symp. Principles of Distributed Computing (PODC '00), pp. 143-152, 2000.
[24] A. Mostéfaoui, E. Mourgaya, and M. Raynal, “Asynchronous Implementation of Failure Detectors,” Proc. Int'l Conf. Dependable Systems and Networks (DSN '03), June 2003.
[25] S. Ponzio and R. Strong, “Semisynchrony and Real Time,” Proc. Sixth Int'l Workshop Distributed Algorithms (WDAG '92), pp. 120-135, Nov. 1992.
[26] N. Santoro and P. Widmayer, “Time Is Not a Healer,” Proc. Sixth Ann. Symp. Theoretical Aspects of Computer Science (STACS '89), pp. 304-313, Feb. 1989.
[27] U. Schmid, B. Weiss, and J. Rushby, “Formally Verified Byzantine Agreement in Presence of Link Faults,” Proc. 22nd Int'l Conf. Distributed Computing Systems (ICDCS '02), pp. 608-616, July 2002.
[28] P.M.B. Vitányi, “Distributed Elections in an Archimedean Ring of Processors,” Proc. 16th Ann. ACM Symp. Theory of Computing (STOC '84), pp. 542-547, 1984.
[29] J. Widder, G. Le Lann, and U. Schmid, “Failure Detection with Booting in Partially Synchronous Systems,” Proc. Fifth European Dependable Computing Conf. (EDCC '05), vol. 3463, pp. 20-37, Apr. 2005.
[30] J. Yang, G. Neiger, and E. Gafni, “Structured Derivations of Consensus Algorithms for Failure Detectors,” Proc. 17th Ann. ACM Symp. Principles of Distributed Computing (PODC '98), pp. 297-308, 1998.