A Methodology for the Rapid Injection of Transient Hardware Errors
August 1996 (vol. 45 no. 8)
pp. 881-891

Abstract: Ultra-dependable computing demands verification of the fault-tolerant mechanisms in hardware. Fault injection, the most popular class of verification methodologies, has serious limitations: methods that are rapid enough to be feasible are not based on actual hardware faults, while methods that are based on gate-level faults require enormous amounts of time. This research tries to bridge that gap by developing a new fault-injection methodology for processors based on a register-transfer-language (RTL) fault model. The fault model is developed by abstracting the effects of low-level faults to the RTL level. This process attempts to be independent of implementation details without sacrificing coverage, defined as the proportion of errors generated by gate-level faults that are successfully reproduced by the RTL fault model. A prototype tool, ASPHALT, automates the process of generating the error patterns. The IBM RISC-Oriented Micro-Processor (ROMP) is used as the basis for experimentation. Over 1.5 million transient faults are injected using a gate-level model; over 97% of these are reproduced with the RTL model at a speedup factor of over 500:1. These results show that the RTL fault model may be used to greatly accelerate fault-injection experiments without sacrificing accuracy.
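The coverage and speedup figures quoted in the abstract can be illustrated with a minimal sketch. This is not the ASPHALT tool or the authors' procedure; it only shows the arithmetic behind the two metrics. The error-pattern encoding (a tuple of register name and bit mask), the function names, and the sample data are all hypothetical, and coverage is assumed here to be counted per injected gate-level fault.

```python
from typing import Hashable, Iterable, Set


def coverage(gate_level_errors: Iterable[Hashable],
             rtl_error_patterns: Set[Hashable]) -> float:
    """Fraction of gate-level error observations that the RTL model reproduces.

    Assumption: one entry in gate_level_errors per injected gate-level fault,
    and rtl_error_patterns is the set of patterns the RTL fault model can generate.
    """
    observed = list(gate_level_errors)
    if not observed:
        raise ValueError("no gate-level error observations to compare against")
    reproduced = sum(1 for err in observed if err in rtl_error_patterns)
    return reproduced / len(observed)


def speedup(gate_level_seconds: float, rtl_seconds: float) -> float:
    """Ratio of gate-level simulation time to RTL injection time."""
    if rtl_seconds <= 0:
        raise ValueError("rtl_seconds must be positive")
    return gate_level_seconds / rtl_seconds


# Hypothetical error patterns: (register name, flipped-bit mask).
gate_errors = [("r1", 0x1), ("r1", 0x1), ("pc", 0x4), ("r2", 0x8)]
rtl_patterns = {("r1", 0x1), ("pc", 0x4)}

print(coverage(gate_errors, rtl_patterns))  # 0.75
print(speedup(1000.0, 2.0))                 # 500.0
```

In this toy data, three of the four gate-level observations are reproducible by the RTL pattern set, giving a coverage of 0.75; the paper reports over 97% on more than 1.5 million injected faults.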

Index Terms:
Fault tolerance, hybrid fault emulation, IBM ROMP, register-transfer language modeling, software-implemented fault injection, Verilog modeling.
Citation:
Charles R. Yount, Daniel P. Siewiorek, "A Methodology for the Rapid Injection of Transient Hardware Errors," IEEE Transactions on Computers, vol. 45, no. 8, pp. 881-891, Aug. 1996, doi:10.1109/12.536231