Fault-Tolerant Rate-Monotonic First-Fit Scheduling in Hard-Real-Time Systems
September 1999 (vol. 10 no. 9)
pp. 934-945

Abstract: Hard-real-time systems require predictable performance despite the occurrence of failures. In this paper, fault tolerance is implemented by using a novel duplication technique where each task scheduled on a processor has either an active backup copy or a passive backup copy scheduled on a different processor. An active copy is always executed, while a passive copy is executed only in the case of a failure. First, the paper considers the ability of the widely used Rate-Monotonic scheduling algorithm to meet the deadlines of periodic tasks in the presence of a processor failure. In particular, the Completion Time Test is extended to check the schedulability, on a single processor, of a task set that includes backup copies. Then, the paper extends the well-known Rate-Monotonic First-Fit assignment algorithm, in which all task copies, including the backup copies, are considered in Rate-Monotonic priority order and each is assigned to the first processor on which it fits. The proposed algorithm determines which tasks must use active duplication and which can use passive duplication. Passive duplication is preferred whenever possible, so as to overbook each processor with many passive copies whose primary copies are assigned to different processors. Moreover, the space allocated to active copies is reclaimed as soon as a failure is detected. Passive copy overbooking and active copy deallocation allow many passive copies to share the same time intervals on the same processor, thus reducing the total number of processors needed. Simulation studies reveal a remarkable saving of processors with respect to those needed by the usual active duplication approach, in which the schedule of the non-fault-tolerant case is duplicated on two sets of processors.
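
For orientation, the two standard building blocks named in the abstract can be sketched as follows. This is only the classical, non-fault-tolerant baseline: a Completion Time Test in the usual response-time iteration form, followed by Rate-Monotonic First-Fit assignment. It is not the extended test or the active/passive backup assignment proposed in the paper. The function names, the `(C, T)` tuple layout, and the assumptions that deadlines equal periods and that times are integers with C <= T are illustrative choices, not taken from the source.

```python
def completion_time_ok(tasks):
    """Classical Completion Time Test on one processor.

    tasks: list of (C, T) pairs (execution time, period), already sorted by
    increasing period, i.e. decreasing Rate-Monotonic priority.
    Assumes integer times, deadline == period, and C <= T (illustrative).
    Returns True if every task completes by its deadline.
    """
    for i, (c_i, t_i) in enumerate(tasks):
        # Start from the total demand of this task and all higher-priority ones.
        w = sum(c for c, _ in tasks[: i + 1])
        while True:
            # Own demand plus interference from higher-priority tasks in [0, w];
            # -(-w // t_j) is integer ceil(w / t_j).
            w_next = c_i + sum(-(-w // t_j) * c_j for c_j, t_j in tasks[:i])
            if w_next > t_i:
                return False      # completion time exceeds the deadline
            if w_next == w:
                break             # fixed point reached: task i is schedulable
            w = w_next
    return True


def rm_first_fit(tasks):
    """Rate-Monotonic First-Fit: take tasks in RM priority order and place
    each on the first processor where the Completion Time Test still passes."""
    processors = []
    for task in sorted(tasks, key=lambda ct: ct[1]):  # shortest period first
        for proc in processors:
            # proc stays sorted by period because tasks arrive in RM order.
            if completion_time_ok(proc + [task]):
                proc.append(task)
                break
        else:
            processors.append([task])  # no existing processor fits: open one
    return processors


if __name__ == "__main__":
    print(rm_first_fit([(2, 4), (3, 6), (1, 4)]))
```

In the scheme described in the abstract, the backup copies would also be fed through the assignment step and the per-processor test would have to account for them; that extension is deliberately left out of this sketch.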


Index Terms:
Fault tolerance, hard-real-time systems, multiprocessor systems, periodic tasks, rate-monotonic scheduling, task replication.
Alan A. Bertossi, Luigi V. Mancini, Federico Rossini, "Fault-Tolerant Rate-Monotonic First-Fit Scheduling in Hard-Real-Time Systems," IEEE Transactions on Parallel and Distributed Systems, vol. 10, no. 9, pp. 934-945, Sept. 1999, doi:10.1109/71.798317