Issue No.09 - September (2010 vol.21)
pp: 1290-1303
Antonio Fernández Anta , LADyR, GSyC, Universidad Rey Juan Carlos, Móstoles
Michel Raynal , IRISA, Université de Rennes, Rennes
ABSTRACT
Considering an asynchronous system made up of n processes, up to t of which can crash, finding the weakest assumption that such a system has to satisfy for a common leader to be eventually elected is one of the holy grail quests of fault-tolerant asynchronous computing. This paper is a step in that direction. It makes two contributions. First, considering a simple and general asynchronous system model in which processes generate asynchronous pulses during which they send and receive messages, it introduces an additional assumption that allows an eventual leader to be elected in all the runs that satisfy that assumption. That assumption is captured by the notion of an asynchronous intermittent rotating t-star. An x-star is made up of one process p (the center of the star) plus a sequence of sets of x processes (the successive points of the star) that satisfies some properties. Intuitively, the intermittent rotating t-star assumption means that there are a process p, a subset of pulse numbers pn, and associated sets of processes Q(pn) such that each process of Q(pn) receives from p a message sent in pulse pn either in a timely manner or among the first (n-t) messages tagged pn it ever receives. The t-star is called rotating because the set Q(pn) is allowed to change with pn; it is intermittent because it can disappear during finite periods; it is asynchronous because the points of a star are not required to be simultaneously at the same pulse. (This assumption combines and generalizes several synchrony and time-free assumptions that have previously been proposed to elect an eventual leader, e.g., eventual t-source, eventual t-moving source, and message pattern assumption.) The second contribution of the paper is an algorithm that eventually elects a common leader in the systems that satisfy the asynchronous intermittent rotating t-star assumption. This algorithm enjoys, among others, two noteworthy properties. First, from a design point of view, it is simple. Second, from a cost point of view, only the pulse numbers increase without bound. This means that, even in infinite executions, whether links are timely or not (and whether the corresponding senders have crashed or not), all the other local variables (including the timers) and message fields have a finite domain.
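The per-pulse condition in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it only checks, for a single pulse number pn, whether a candidate center p has at least t points, i.e., processes that received p's message tagged pn either within a bound or among the first (n-t) messages tagged pn. The names `Reception`, `is_star_point`, `has_t_star`, the `delay` field, and the bound `delta` are hypothetical, and the sketch assumes that the center is not counted among its own points and that a global observer can see every reception log, which no process in a real asynchronous system could do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reception:
    sender: str    # process that sent the message
    pulse: int     # pulse number the message is tagged with
    delay: float   # transfer delay seen by a hypothetical global observer


def is_star_point(receptions, center, pulse, n, t, delta):
    """Check one process's log (in arrival order) against the two cases.

    Returns True if the message sent by `center` in `pulse` was received
    timely (delay <= delta, an assumed bound) or among the first (n - t)
    messages tagged `pulse`.
    """
    tagged = [r for r in receptions if r.pulse == pulse]
    for rank, r in enumerate(tagged):
        if r.sender == center:
            timely = r.delay <= delta
            winning = rank < n - t   # rank is 0-based
            return timely or winning
    return False                     # message from the center never arrived


def has_t_star(logs, center, pulse, n, t, delta):
    """True if at least t processes other than `center` qualify as points."""
    points = {
        q for q, recs in logs.items()
        if q != center and is_star_point(recs, center, pulse, n, t, delta)
    }
    return len(points) >= t


# Illustrative data: n = 4 processes, t = 1, candidate center "p1", pulse 5.
logs = {
    # Late (delay 9.0) but second of the messages tagged 5: a winning message.
    "p2": [Reception("p3", 5, 0.3), Reception("p1", 5, 9.0),
           Reception("p4", 5, 0.4)],
    # Late and fourth of the messages tagged 5: neither timely nor winning.
    "p3": [Reception("p2", 5, 0.2), Reception("p4", 5, 0.2),
           Reception("p3", 5, 0.0), Reception("p1", 5, 9.0)],
}
star_exists = has_t_star(logs, "p1", 5, n=4, t=1, delta=1.0)
```

With this data, "p2" qualifies through the message-pattern case alone, which is how the assumption combines timer-based and time-free conditions. The rotating and intermittent aspects would correspond to repeating the check over different pulse numbers, with the set of points free to change from one pulse to the next.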
INDEX TERMS
Assumption coverage, asynchronous system, distributed algorithm, eventual t-source, eventual leader, failure detector, fault tolerance, message pattern, moving source, omega, partial synchrony, process crash, system model, timely link.
CITATION
Antonio Fernández Anta, Michel Raynal, "From an Asynchronous Intermittent Rotating Star to an Eventual Leader", IEEE Transactions on Parallel & Distributed Systems, vol.21, no. 9, pp. 1290-1303, September 2010, doi:10.1109/TPDS.2009.163