Issue No.05 - Sept.-Oct. (2012 vol.38)
pp: 1178-1194
Bev Littlewood , City University, London
John Rushby , SRI International, Menlo Park
ABSTRACT
This paper refines and extends an earlier one by the first author [1]. It considers the problem of reasoning about the reliability of fault-tolerant systems with two “channels” (i.e., components) of which one, A, because it is conventionally engineered and presumed to contain faults, supports only a claim of reliability, while the other, B, by virtue of extreme simplicity and extensive analysis, supports a plausible claim of “perfection.” We begin with the case where either channel can bring the system to a safe state. The reasoning about system probability of failure on demand (pfd) is divided into two steps. The first concerns aleatory uncertainty about 1) whether channel A will fail on a randomly selected demand and 2) whether channel B is imperfect. It is shown that, conditional upon knowing $p_A$ (the probability that A fails on a randomly selected demand) and $p_B$ (the probability that channel B is imperfect), a conservative bound on the probability that the system fails on a randomly selected demand is simply $p_A \times p_B$. That is, there is conditional independence between the events “A fails” and “B is imperfect.” The second step of the reasoning involves epistemic uncertainty, represented by assessors' beliefs about the distribution of $(p_A, p_B)$, and it is here that dependence may arise. However, we show that under quite plausible assumptions, a conservative bound on system pfd can be constructed from point estimates for just three parameters. We discuss the feasibility of establishing credible estimates for these parameters. We extend our analysis from faults of omission to those of commission, and then combine these to yield an analysis for monitored architectures of a kind proposed for aircraft.
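The two-step reasoning in the abstract can be written out as a short sketch. It uses only what the abstract states plus elementary probability; the expectation $E[\cdot]$ over the assessor's joint beliefs about $(p_A, p_B)$ is notation assumed here, and the three-parameter bound derived in the paper is not reproduced.

```latex
% Step 1 (aleatory), conditional on known p_A and p_B.
% Either channel can bring the system to a safe state, so the system fails
% on a demand only if both channels fail, and B can fail only if it is imperfect.
\begin{align*}
P(\text{system fails} \mid p_A, p_B)
  &\le P(\text{A fails},\ \text{B imperfect} \mid p_A, p_B) \\
  &= P(\text{A fails} \mid p_A)\, P(\text{B imperfect} \mid p_B)
   = p_A \times p_B .
\end{align*}
% Step 2 (epistemic): average the conditional bound over the assessor's
% beliefs about (p_A, p_B); E[.] is assumed notation for that average.
\begin{align*}
\mathit{pfd} \;\le\; E[p_A\, p_B]
  \;=\; E[p_A]\, E[p_B] + \operatorname{Cov}(p_A, p_B).
\end{align*}
```

The covariance term shows where dependence can enter at the epistemic level: the product of the two means is a valid bound only when the assessor's beliefs about $p_A$ and $p_B$ are uncorrelated or negatively correlated.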
INDEX TERMS
Uncertainty, Software, Phase frequency detector, Cognition, Software reliability, Safety, software diversity, software fault tolerance, program correctness, assurance case
CITATION
Bev Littlewood, John Rushby, "Reasoning about the Reliability of Diverse Two-Channel Systems in Which One Channel Is 'Possibly Perfect'," IEEE Transactions on Software Engineering, vol. 38, no. 5, pp. 1178-1194, Sept.-Oct. 2012, doi:10.1109/TSE.2011.80
REFERENCES
[1] B. Littlewood, "The Use of Proof in Diversity Arguments," IEEE Trans. Software Eng., vol. 26, no. 10, pp. 1022-1023, Oct. 2000.
[2] J.C. Rouquet and P.J. Traverse, "Safe and Reliable Computing on Board the Airbus and ATR Aircraft," Proc. Int'l Workshop Safety of Computer Control Systems, July 1986.
[3] Statistical Summary of Commercial Jet Aircraft Accidents, Worldwide Operations, 1959-2009, Boeing Commercial Airplane Group, Seattle, Wash., July, Boeing Airplane Safety Eng., http://www.boeing.com/news/techissues/pdf statsum.pdf, 2010.
[4] In-Flight Upset 154 Km West of Learmonth, WA, 7 October 2008, VH-QPA Airbus A330-303, Australian Transport Safety Bureau, Aviation Occurrence Investigation AO-2008-070, Final, Dec. 2011.
[5] D.E. Eckhardt, A.K. Caglayan, J.C. Knight, L.D. Lee, D.F. McAllister, M.A. Vouk, and J.P.J. Kelly, "An Experimental Evaluation of Software Redundancy as a Strategy for Improving Reliability," IEEE Trans. Software Eng., vol. 17, no. 7, pp. 692-702, July 1991.
[6] J.C. Knight and N.G. Leveson, "An Experimental Evaluation of the Assumption of Independence in Multiversion Programming," IEEE Trans. Software Eng., vol. 12, no. 1, pp. 96-109, Jan. 1986.
[7] D.E. Eckhardt Jr. and L.D. Lee, "A Theoretical Basis for the Analysis of Multiversion Software Subject to Coincident Errors," IEEE Trans. Software Eng., vol. 11, no. 12, pp. 1511-1517, Dec. 1985.
[8] B. Littlewood and D.R. Miller, "Conceptual Modeling of Coincident Failures in Multiversion Software," IEEE Trans. Software Eng., vol. 15, no. 12, pp. 1596-1614, Dec. 1989.
[9] J.C. Knight and N.G. Leveson, "An Empirical Study of Failure Probabilities in Multi-Version Software," Proc. Fault Tolerant Computing Symp. 16, pp. 165-170, July 1986.
[10] B. Littlewood, P.T. Popov, L. Strigini, and N. Shryane, "Modeling the Effects of Combining Diverse Software Fault Detection Techniques," IEEE Trans. Software Eng., vol. 26, no. 12, pp. 1157-1167, Dec. 2000.
[11] Aerospace Recommended Practice (ARP) 4754: Certification Considerations for Highly-Integrated or Complex Aircraft Systems, Soc. of Automotive Eng., also issued as EUROCAE ED-79, Nov. 1996.
[12] System Design and Analysis, Federal Aviation Administration, Advisory Circular 25.1309-1A, June 1988.
[13] T. Kelly, "Arguing Safety—A Systematic Approach to Safety Case Management," PhD dissertation, Dept. of Computer Science, Univ. of York, U.K., 1998.
[14] P. Bishop and R. Bloomfield, "A Methodology for Safety Case Development," Proc. Safety-Critical Systems Symp., http://www.adelard.com/resources/papers/ pdfsss98web.pdf, Feb. 1998.
[15] W.L. Oberkampf and J.C. Helton, "Alternative Representations of Epistemic Uncertainty," Reliability Eng. and System Safety, vol. 85, nos. 1-3, pp. 1-10, 2004.
[16] A. O'Hagan, C.E. Buck, A. Daneshkhah, J.R. Eiser, P.H. Garthwaite, D.J. Jenkinson, J.E. Oakley, and T. Rakow, Uncertain Judgements: Eliciting Experts' Probabilities. Wiley, 2006.
[17] Safety Assessment Principles for Nuclear Facilities, 2006 ed., UK Health and Safety Executive, Bootle, U.K., http://www.hse.gov.uk/nuclear/sapssaps2006.pdf, 2006.
[18] Licensing of Safety Critical Software for Nuclear Reactors: Common Position of Seven European Nuclear Regulators and Authorised Technical Support Organizations, AVN Belgium, BfS Germany, CSN Spain, ISTec Germany, NII United Kingdom, SKI Sweden, STUK Finland, http://www.bfs.de/de/kerntechnik/sicherheit Licensing_safety_critical_software.pdf, 2007.
[19] Air Traffic Services Safety Requirements, CAP 670. UK Civil Aviation Authority, Safety Regulation Group, see Part B, Section 3, Systems Eng. SW01: Regulatory Objectives for Software Safety Assurance in ATS Equipment; http://www.caa.co.uk/docs/33cap670.pdf, June 2008.
[20] Defence Standard 00-56, Issue 4: Safety Management Requirements for Defence Systems. Part 1: Requirements, UK Ministry of Defence, http://www.dstan.mod.uk/data/00/05601000400.pdf, June 2007.
[21] Engineering Safety Management (The Yellow Book), Vol. 1 and 2, Fundamentals and Guidance, no. 4. Rail Safety and Standards Board, http://www.yellowbook-rail.org.uk/site/the_yellow_book the_yellow_book.html, 2007.
[22] DO-178B: Software Considerations in Airborne Systems and Equipment Certification, Requirements and Technical Concepts for Aviation, Washington, DC, this document is known as EUROCAE ED-12B in Europe, Dec. 1992.
[23] DO-297: Integrated Modular Avionics (IMA) Development Guidance and Certification Considerations, Requirements and Technical Concepts for Aviation, Washington, DC, also issued as EUROCAE ED-124, Nov. 2005-2007.
[24] Health and Safety at Work etc. Act. UK Health and Safety Executive, http://www.hse.gov.uk/legislationhswa.htm; Guidance Suite http://www.hse.gov.uk/risk/theoryalarp.htm, 1974.
[25] The Use of Computers in Safety-Critical Applications: Final Report of the Study Group on the Safety of Operational Computer Systems. UK Health and Safety Commission, http://www.hse.gov.uk/ nuclearcomputers.pdf, 1998.
[26] B. Littlewood and L. Strigini, "Validation of Ultrahigh Dependability for Software-Based Systems," Comm. ACM, pp. 69-80, Nov. 1993.
[27] R.W. Butler and G.B. Finelli, "The Infeasibility of Experimental Quantification of Life-Critical Software Reliability," IEEE Trans. Software Eng., vol. 19, no. 1, pp. 3-12, Jan. 1993.
[28] J. May, G. Hughes, and A.D. Lunn, "Reliability Estimation from Appropriate Testing of Plant Protection Software," IEE/BCS Software Eng. J., vol. 10, no. 6, pp. 206-218, Nov. 1995.
[29] B. Littlewood and D. Wright, "The Use of Multi-Legged Arguments to Increase Confidence in Safety Claims for Software-Based Systems: A Study Based on a BBN Analysis of an Idealized Example," IEEE Trans. Software Eng., vol. 33, no. 5, pp. 347-365, May 2007.
[30] Development Guidelines for Vehicle Based Software. The Motor Industry Software Reliability Assoc. (MISRA), Jan. 2001.
[31] J. Rushby, "Software Verification and System Assurance," Proc. Seventh Int'l Conf. Software Eng. and Formal Methods, D.V. Hung and P. Krishnan, eds., pp. 3-10, Nov. 2009.
[32] B. Littlewood, J. Rushby, and L. Strigini, "On the Nature of Software Assurance," technical report, Computer Science Laboratory, SRI Int'l, Menlo Park, Calif., 2012.
[33] B. Littlewood and A. Povyakalo, "On Claims for the Perfection of Software," technical report, Centre for Software Reliability, City Univ., Jan. 2010.
[34] B. Littlewood and A. Povyakalo, "Conservative Reasoning about Epistemic Uncertainty for the Probability of Failure on Demand of a 1oo2 Software-Based System in Which One Channel Is 'Possibly Perfect'," technical report, Centre for Software Reliability, City Univ., Jan. 2010.
[35] S.M. Ross, Stochastic Processes. Wiley, 1996.
[36] E. Lloyd and W. Tye, Systematic Safety: Safety Assessment of Aircraft Systems. Civil Aviation Authority, 1992.
[37] CAST Position Paper 24: Reliance on Development Assurance Alone when Performing a Complex and Full-Time Critical Function, Certification Authorities Software Team (CAST), http://www.faa.gov/aircraft/air_cert/design_approvals/ air_software/castcast_papers/, Mar. 2006.
[38] K. Havelund and G. Rosu, "Efficient Monitoring of Safety Properties," Software Tools for Technology Transfer, vol. 6, no. 2, pp. 158-173, Aug. 2004.
[39] H. Barringer, D. Rydeheard, and K. Havelund, "Rule Systems for Run-Time Monitoring: From EAGLE to RULER," Proc. Int'l Conf. Runtime Verification, pp. 111-125, Mar. 2007.
[40] Report on the Incident to Airbus A340-642, Registration G-VATL En-Route from Hong Kong to London Heathrow on 8 Feb. 2005, UK Air Investigations Branch, http://www.aaib.gov.uk/publications/formal_reports 4_2007_g_vatl.cfm, 2007.
[41] Safety Recommendations A-98-3 through -5, Nat'l Transportation Safety Board, Washington, D.C., http://www.ntsb.gov/Recs/letters/1998A98_3_5.pdf, Jan. 1998.
[42] A.S. Willsky, "A Survey of Methods for Failure Detection in Dynamic Systems," Automatica, vol. 12, no. 6, pp. 601-611, Nov. 1976.
[43] R.D. Schlichting and F.B. Schneider, "Fail-Stop Processors: An Approach to Designing Fault-Tolerant Computing Systems," ACM Trans. Computer Systems, vol. 1, no. 3, pp. 222-238, Apr. 1983.
[44] J. Rushby, "Kernels for Safety?" Proc. Symp. Safe and Secure Computing Systems, T. Anderson, ed., chapter 13, pp. 210-220, Oct. 1986 (published 1989).
[45] K.G. Wika and J.C. Knight, "On the Enforcement of Software Safety Policies," Proc. 10th Ann. Conf. Computer Assurance, pp. 83-93, June 1995.
[46] A. Arora and S.S. Kulkarni, "Designing Masking Fault Tolerance via Nonmasking Fault Tolerance," IEEE Trans. Software Eng., vol. 24, no. 6, pp. 435-450, June 1998.
[47] F. Schneider, "Enforceable Security Policies," ACM Trans. Information and System Security, vol. 3, no. 1, pp. 30-50, Feb. 2000.