Dependability of COTS Microkernel-Based Systems
February 2002 (vol. 51 no. 2)
pp. 138-163

The commercial offer concerning microkernel technology constitutes an attractive alternative for developing operating systems to suit a wide range of application domains. However, the integration of COTS microkernels into critical embedded computer systems is a problem for system developers, in particular due to the lack of objective data concerning their behavior in the presence of faults. This paper addresses this issue by describing a prototype environment (MAFALDA: Microkernel Assessment by Fault injection AnaLysis and Design Aid) that is aimed at providing objective failure data on a candidate microkernel and also improving its error detection capabilities. The paper first presents the overall architecture of MAFALDA. Then, a case study carried out on an instance of the Chorus microkernel is used to illustrate the benefits that can be obtained with MAFALDA both from the dependability assessment and design-aid viewpoints. Implementation issues are also addressed that account for the specific API of the target microkernel. Some overall insights and lessons learned, gained during the various studies conducted on both Chorus and another target microkernel (LynxOS), are then depicted and discussed. Finally, we conclude the paper by summarizing the main features of the work presented and by identifying future research.

Index Terms:
COTS microkernels, dependability characterization, fault injection, error confinement, wrapping.
J. Arlat, J.-C. Fabre, M. Rodríguez, F. Salles, "Dependability of COTS Microkernel-Based Systems," IEEE Transactions on Computers, vol. 51, no. 2, pp. 138-163, Feb. 2002, doi:10.1109/12.980005