Issue No. 03 - March (2004 vol. 30)
Analysis of anomalies that occur during operations is an important means of improving the quality of current and future software. Although the benefits of anomaly analysis of operational software are widely recognized, there has been relatively little research on anomaly analysis of safety-critical systems. In particular, patterns of software anomaly data for operational, safety-critical systems are not well understood. This paper presents the results of a pilot study using Orthogonal Defect Classification (ODC) to analyze nearly two hundred such anomalies on seven spacecraft systems. These data show several unexpected classification patterns such as the causal role of difficulties accessing or delivering data, of hardware degradation, and of rare events. The anomalies often revealed latent software requirements that were essential for robust, correct operation of the system. The anomalies also caused changes to documentation and to operational procedures to prevent the same anomalous situations from recurring. Feedback from operational anomaly reports helped measure the accuracy of assumptions about operational profiles, identified unexpected dependencies among embedded software and their systems and environment, and indicated needed improvements to the software, the development process, and the operational procedures. The results indicate that, for long-lived, critical systems, analysis of the most severe anomalies can be a useful mechanism both for maintaining safer, deployed systems and for building safer, similar systems in the future.
Software and system safety, diagnostics, maintenance process, product metrics.
R. R. Lutz and I. Carmen Mikulski, "Empirical Analysis of Safety-Critical Anomalies During Operations," in IEEE Transactions on Software Engineering, vol. 30, no. , pp. 172-180, 2004.