G. Manimaran, C. Siva Ram Murthy, "A Fault-Tolerant Dynamic Scheduling Algorithm for Multiprocessor Real-Time Systems and Its Analysis," IEEE Transactions on Parallel and Distributed Systems, vol. 9, no. 11, pp. 1137-1152, November 1998. ISSN 1045-9219. DOI: 10.1109/71.735960. IEEE Computer Society, Los Alamitos, CA, USA.
Keywords: real-time system, dynamic scheduling, fault tolerance, resource reclaiming, run-time anomaly, safety critical application.
Abstract: Many time-critical applications require dynamic scheduling with predictable performance. Tasks corresponding to these applications have deadlines that must be met despite the presence of faults. In this paper, we propose an algorithm to dynamically schedule arriving real-time tasks with resource and fault-tolerance requirements onto multiprocessor systems. The tasks are assumed to be nonpreemptable, and each task has two copies (versions) that are mutually excluded in space, to handle permanent processor failures, and in time in the schedule, to obtain better performance. Our algorithm can tolerate more than one fault at a time, and employs performance-improving techniques such as 1) the distance concept, which decides the relative position of the two copies of a task in the task queue, 2) flexible backup overloading, which introduces a trade-off between degree of fault tolerance and performance, and 3) resource reclaiming, which reclaims resources both from deallocated backups and from early-completing tasks. We quantify, through simulation studies, the effectiveness of each of these techniques in improving the guarantee ratio, which is defined as the percentage of all tasks arriving in the system whose deadlines are met. We also compare, through simulation studies, the performance of our algorithm with a best-known algorithm for the problem, and show analytically the importance of the distance parameter in fault-tolerant dynamic scheduling in multiprocessor real-time systems.
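As a rough illustration of two notions from the abstract, the sketch below checks whether the two copies of a task are mutually excluded in space and time, and computes the guarantee ratio. It is not the paper's algorithm. The names `Copy`, `mutually_excluded`, and `guarantee_ratio` are hypothetical, and time exclusion is assumed here to mean that the two execution intervals do not overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One scheduled copy (primary or backup) of a task.

    Field names are illustrative assumptions, not taken from the paper.
    """
    processor: int
    start: float
    finish: float


def mutually_excluded(primary: Copy, backup: Copy) -> bool:
    """Return True if the two copies are excluded in space and in time.

    Space exclusion: the copies run on different processors.
    Time exclusion (assumed interpretation): their intervals do not overlap.
    """
    space_ok = primary.processor != backup.processor
    time_ok = primary.finish <= backup.start or backup.finish <= primary.start
    return space_ok and time_ok


def guarantee_ratio(arrived: int, deadlines_met: int) -> float:
    """Percentage of arrived tasks whose deadlines are met."""
    if arrived <= 0:
        raise ValueError("arrived must be positive")
    return 100.0 * deadlines_met / arrived
```

For example, a primary on processor 0 over [0, 5] and a backup on processor 1 over [5, 10] satisfy both exclusions under this interpretation, whereas placing both copies on the same processor violates space exclusion.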

