W. Chen, S. Toueg, M.K. Aguilera, "On the Quality of Service of Failure Detectors," IEEE Transactions on Computers, vol. 51, no. 5, pp. 561-580, May 2002.
BibTeX:
@article{10.1109/TC.2002.1004595,
  author = {W. Chen and S. Toueg and M.K. Aguilera},
  title = {On the Quality of Service of Failure Detectors},
  journal = {IEEE Transactions on Computers},
  volume = {51},
  number = {5},
  issn = {0018-9340},
  year = {2002},
  pages = {561-580},
  doi = {http://doi.ieeecomputersociety.org/10.1109/TC.2002.1004595},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
RefWorks / ProCite / RefMan / EndNote:
TY  - JOUR
JO  - IEEE Transactions on Computers
TI  - On the Quality of Service of Failure Detectors
IS  - 5
SN  - 0018-9340
SP  - 561
EP  - 580
EPD - 561-580
A1  - W. Chen
A1  - S. Toueg
A1  - M.K. Aguilera
PY  - 2002
KW  - Failure detectors
KW  - quality of service
KW  - fault tolerance
KW  - distributed algorithm
KW  - probabilistic analysis
VL  - 51
JA  - IEEE Transactions on Computers
ER  -
We study the
[1] M.K. Aguilera, W. Chen, and S. Toueg, “Using the Heartbeat Failure Detector for Quiescent Reliable Communication and Consensus in Partitionable Networks,” Theoretical Computer Science, vol. 220, no. 1, pp. 3-30, June 1999.
[2] M.K. Aguilera, W. Chen, and S. Toueg, “Failure Detection and Consensus in the Crash-Recovery Model,” Distributed Computing, vol. 13, no. 2, pp. 99-125, Apr. 2000.
[3] M.K. Aguilera, W. Chen, and S. Toueg, “On Quiescent Reliable Communication,” SIAM J. Computing, vol. 29, no. 6, pp. 2040-2073, Apr. 2000.
[4] A.O. Allen, Probability, Statistics, and Queueing Theory with Computer Science Applications. New York: Academic, 1978.
[5] Y. Amir et al., “Transis: A Communication Subsystem for High Availability,” Proc. Int'l Symp. Fault-Tolerant Computing, IEEE CS Press, Los Alamitos, Calif., pp. 76-84, 1992.
[6] K. Arvind, “Probabilistic Clock Synchronization in Distributed Systems,” IEEE Trans. Parallel and Distributed Systems, vol. 5, no. 5, pp. 475-487, May 1994.
[7] O. Babaoglu, R. Davoli, L.-A. Giachini, and M.G. Baker, “Relacs: A Communications Infrastructure for Constructing Reliable Applications in Large-Scale Distributed Systems,” BROADCAST Project deliverable report, Dept. of Computing Science, Univ. of Newcastle upon Tyne, U.K., 1994.
[8] P. Billingsley, Probability and Measure, third ed. John Wiley & Sons, 1995.
[9] Reliable Distributed Computing with the Isis Toolkit, K.P. Birman and R. van Renesse, eds. IEEE CS Press, 1993.
[10] Requirements for Internet Hosts-Communication Layers, R. Braden, ed., RFC 1122, Oct. 1989.
[11] T.D. Chandra, V. Hadzilacos, S. Toueg, and B. Charron-Bost, “On the Impossibility of Group Membership,” Proc. 15th ACM Symp. Principles of Distributed Computing, pp. 322-330, 1996.
[12] T.D. Chandra and S. Toueg, “Unreliable Failure Detectors for Reliable Distributed Systems,” J. ACM, vol. 43, no. 2, pp. 225-267, 1996.
[13] W. Chen, “On the Quality of Service of Failure Detectors,” PhD thesis, Cornell Univ., May 2000, available at.
[14] F. Cristian, “Probabilistic Clock Synchronization,” Distributed Computing, vol. 3, no. 3, pp. 146-158, 1989.
[15] B. Deianov and S. Toueg, “Failure Detector Service for Dependable Computing (Fast Abstract),” Proc. 2000 Int'l Conf. Dependable Systems and Networks, pp. B14-B15, June 2000.
[16] D. Dolev, R. Friedman, I. Keidar, and D. Malkhi, “Failure Detectors in Omission Failure Environments,” Technical Report 96-1608, Dept. of Computer Science, Cornell Univ., Ithaca, N.Y., Sept. 1996.
[17] C. Fetzer and F. Cristian, “Fail-Aware Failure Detectors,” Proc. 15th Symp. Reliable Distributed Systems, pp. 200-209, Oct. 1996.
[18] C. Fetzer and F. Cristian, “A Fail-Aware Datagram Service,” Proc. Second Ann. Workshop Fault-Tolerant Parallel and Distributed Systems, Apr. 1997.
[19] C. Fetzer and F. Cristian, “A Fail-Aware Membership Service,” Proc. 16th Symp. Reliable Distributed Systems, pp. 157-164, Oct. 1997.
[20] M.G. Gouda and T.M. McGuire, “Accelerated Heartbeat Protocols,” Proc. 18th Int'l Conf. Distributed Computing Systems, May 1998.
[21] R. Guerraoui, M. Larrea, and A. Schiper, “Non-Blocking Atomic Commitment with an Unreliable Failure Detector,” Proc. 14th Symp. Reliable Distributed Systems (SRDS '95), pp. 41-51, Sept. 1995.
[22] M.G. Hayden, “The Ensemble System,” PhD thesis, Cornell Univ., 1998.
[23] L.E. Moser, P.M. Melliar-Smith, D.A. Agarwal, R.K. Budhia, and C.A. Lingley-Papadopoulos, “Totem: A Fault-Tolerant Multicast Group Communication System,” Comm. ACM, vol. 39, no. 4, pp. 54-63, 1996.
[24] G.F. Pfister, In Search of Clusters, second ed. New Jersey: Prentice Hall, 1998.
[25] M. Raynal and F. Tronel, “Group Membership Failure Detection: A Simple Protocol and Its Probabilistic Analysis,” Distributed Systems Eng. J., vol. 6, no. 3, pp. 95-102, 1999.
[26] S.M. Ross, Stochastic Processes. John Wiley & Sons, 1983.
[27] K. Sigman, Stationary Marked Point Processes, an Intuitive Approach. Chapman & Hall, 1995.
[28] S. Toueg and D. Ivan, private communication, May 2001.
[29] R. van Renesse, K.P. Birman, and S. Maffeis, “Horus: A Flexible Group Communication System,” Comm. ACM, vol. 39, no. 4, pp. 76-83, 1996.
[30] R. van Renesse, Y. Minsky, and M. Hayden, “A Gossip-Style Failure Detection Service,” Proc. Middleware '98, Sept. 1998.
[31] P. Veríssimo and M. Raynal, “Time in Distributed System Models and Algorithms,” Advances in Distributed Systems: Advanced Distributed Computing from Algorithms to Systems, S. Krakowiak and S. K. Shrivastava, eds., chapter 1, Springer-Verlag, 2000.

