Computing Reliability Intervals for k-Resilient Protocols
March 1995 (vol. 44 no. 3)
pp. 462-466

Abstract:
k-resilient protocols are used in some parallel and distributed system applications to increase the availability of resources. A protocol running on an n-site system is k-resilient if it can tolerate up to k site failures and still operate correctly. The reliability of such a protocol is defined as the probability that no more than k sites have failed. A k-resilient protocol is beneficial only while its reliability is greater than that of a protocol running on a single-site system. We consider k-resilient protocols and develop a general technique for approximately computing the time until which these protocols have higher reliability than protocols running on single-site systems. We call this time the reliability interval. The technique can be used irrespective of the failure distribution (with respect to time) of the sites of the system. We use experimental results to validate the technique.
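
The definitions in the abstract can be illustrated with a minimal numerical sketch. It is not the approximation technique developed in the paper. It assumes that sites fail independently with a common reliability function, that the protocol's advantage over a single site changes sign only once inside the search bracket, and it uses plain bisection. The names `protocol_reliability`, `reliability_interval`, `site_reliability`, and `t_max` are hypothetical, chosen only for this illustration.

```python
from math import comb, exp


def protocol_reliability(n, k, r):
    """Probability that at most k of n sites have failed.

    Assumes independent sites, each up with probability r.
    """
    q = 1.0 - r
    return sum(comb(n, i) * q**i * r ** (n - i) for i in range(k + 1))


def reliability_interval(n, k, site_reliability, t_max, tol=1e-9):
    """Estimate the time until which the k-resilient protocol beats one site.

    site_reliability(t) is the per-site reliability at time t (any
    distribution). Assumes a single crossover in (0, t_max]; bisection
    then narrows down the crossover time.
    """
    def advantage(t):
        r = site_reliability(t)
        return protocol_reliability(n, k, r) - r

    if advantage(t_max) > 0:
        raise ValueError("no crossover before t_max; widen the bracket")

    lo, hi = 0.0, t_max
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if advantage(mid) > 0:
            lo = mid  # protocol still more reliable than a single site
        else:
            hi = mid  # single site is at least as reliable
    return (lo + hi) / 2.0


# Illustration: n = 4 sites, k = 1 tolerated failure, exponentially
# distributed site lifetimes with rate 1 (an assumed example, not from
# the paper).
t_star = reliability_interval(4, 1, lambda t: exp(-t), t_max=10.0)
print(t_star)
```

Because `site_reliability` is passed in as a function, the same sketch works for other failure distributions, as long as the single-crossover assumption holds for the chosen bracket.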

Index Terms:
k-resilient protocols, reliability, reliability interval, scalable mission time.
Yennun Huang, Sampath Rangarajan, Satish K. Tripathi, "Computing Reliability Intervals for k-Resilient Protocols," IEEE Transactions on Computers, vol. 44, no. 3, pp. 462-466, March 1995, doi:10.1109/12.372039