A Real-Time Primary-Backup Replication Service
June 1999 (vol. 10 no. 6)
pp. 533-548

Abstract: This paper presents a real-time primary-backup replication scheme that supports fault-tolerant data access in a real-time environment. The main features of the system are fast response to client requests, bounded inconsistency between primary and backup, a temporal consistency guarantee for replicated data, and quick recovery from failures. The paper defines external and interobject temporal consistency, introduces the notion of phase variance, and builds a computation model that ensures these forms of consistency for replicated data: deterministically where the underlying communication mechanism provides deterministic message delivery semantics, and probabilistically where no such support is available. It also presents an optimization of the system and an analysis of the failover process, covering failover consistency and failure recovery time. An implementation of the proposed scheme is built within the x-kernel architecture on the MK 7.2 microkernel from the Open Group. The results of a detailed performance evaluation of this implementation are also discussed.

Index Terms:
Real-time systems, fault tolerance, replication protocols, temporal consistency, phase variance, probabilistic consistency.
Hengming Zou, Farnam Jahanian, "A Real-Time Primary-Backup Replication Service," IEEE Transactions on Parallel and Distributed Systems, vol. 10, no. 6, pp. 533-548, June 1999, doi:10.1109/71.774905