Issue No. 02 - February (1985 vol. 11)
R.K. Iyer, Computer Systems Group at the Coordinated Science Laboratory and the Department of Electrical and Computer Engineering, University of Illinois
This paper describes an analysis of hardware-related software (HW/SW) errors on an MVS/SP operating system at Stanford University. The analysis procedure demonstrates a methodology for evaluating the interaction between hardware and software as it relates to system reliability. The paper examines the operating system's handling of HW/SW errors and the effectiveness of recovery management. Nearly 35 percent of all observed software failures were found to be hardware-related. The analysis shows that the operating system is seldom able to diagnose that a software error may be hardware-related. The impact of HW/SW errors on the system is evaluated by measuring how effectively system recovery contains their propagation. The system failure probability for HW/SW errors is close to three times that for software errors in general. The observed HW/SW errors follow a specific pattern, suggesting that such error patterns could be used for intelligent error prediction and recovery.
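The two headline quantities in the abstract (the share of software errors that are hardware-related, and the failure probability of HW/SW errors relative to software errors in general) can be illustrated with a minimal sketch. The record layout and the names `ErrorRecord`, `failure_probability`, and `summarize` are assumptions made for illustration only; they are not the paper's log format or its analysis procedure.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ErrorRecord:
    """Hypothetical record for one observed software error."""
    hardware_related: bool  # assumed flag: error classified as HW/SW
    system_failed: bool     # assumed flag: recovery did not prevent a system failure


def failure_probability(records: Iterable[ErrorRecord]) -> float:
    """Fraction of the given error records that ended in a system failure."""
    records = list(records)
    if not records:
        return 0.0
    return sum(r.system_failed for r in records) / len(records)


def summarize(records: Iterable[ErrorRecord]) -> dict:
    """Compare HW/SW errors against all software errors in the sample."""
    records = list(records)
    hw = [r for r in records if r.hardware_related]
    p_all = failure_probability(records)
    p_hw = failure_probability(hw)
    return {
        # share of software errors classified as hardware-related
        "hw_fraction": len(hw) / len(records) if records else 0.0,
        # failure probability over all software errors
        "p_fail_all": p_all,
        # failure probability restricted to HW/SW errors
        "p_fail_hw": p_hw,
        # relative failure probability; undefined if no failures were observed
        "ratio": p_hw / p_all if p_all > 0 else float("nan"),
    }
```

Under these assumptions, calling `summarize` on a list of classified error records returns the hardware-related share and the failure-probability ratio, the kind of figures the abstract reports as nearly 35 percent and close to three times.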
Software reliability, hardware/software interactions, recovery analysis
R. Iyer and P. Velardi, "Hardware-Related Software Errors: Measurement and Analysis," in IEEE Transactions on Software Engineering, vol. 11, no. 2, pp. 223-231, 1985.