High-Performance Fault-Tolerant VLSI Systems Using Micro Rollback
April 1990 (vol. 39 no. 4)
pp. 548-554

A technique called micro rollback, which allows most of the performance penalty for concurrent error detection to be eliminated, is presented. Detection is performed in parallel with the transmission of information between modules, thus removing the delay for detection from the critical path. Erroneous information may thus reach its destination module several clock cycles before an error indication. Operations performed on this erroneous information are undone using a hardware mechanism for fast rollback of a few cycles. The implementation of a VLSI processor capable of micro rollback is discussed, as well as several critical issues related to its use in a complete system.
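
The rollback idea in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design discussed in the article: the class name `RollbackRegisterFile`, the `depth` parameter, and the choice to hold recent writes in a short buffer before committing them are all hypothetical, chosen to show how state changes made with erroneous data could be undone when an error indication arrives a few cycles late.

```python
from collections import deque


class RollbackRegisterFile:
    """Toy register file whose most recent writes can be undone.

    Assumption (not taken from the article): each write waits in a short
    buffer for `depth` clock cycles before it is committed, so an error
    reported up to `depth` cycles late can still be rolled back.
    """

    def __init__(self, num_regs, depth):
        self.committed = [0] * num_regs  # state that can no longer be undone
        self.depth = depth               # maximum rollback distance in cycles
        self.pending = deque()           # one entry per cycle: (reg, value) or None

    def step(self, write=None):
        """Advance one clock cycle, optionally writing (reg, value)."""
        self.pending.append(write)
        if len(self.pending) > self.depth:
            oldest = self.pending.popleft()
            if oldest is not None:
                reg, value = oldest
                self.committed[reg] = value

    def read(self, reg):
        """Return the newest value of `reg`, preferring uncommitted writes."""
        for entry in reversed(self.pending):
            if entry is not None and entry[0] == reg:
                return entry[1]
        return self.committed[reg]

    def rollback(self, cycles):
        """Discard the writes made during the last `cycles` clock cycles."""
        if not 0 <= cycles <= self.depth:
            raise ValueError("rollback distance exceeds buffer depth")
        for _ in range(min(cycles, len(self.pending))):
            self.pending.pop()


# Usage: an erroneous value arrives, is used for one more cycle,
# and the error indication shows up two cycles after the bad write.
rf = RollbackRegisterFile(num_regs=4, depth=3)
rf.step((1, 10))   # correct write
rf.step((1, 99))   # erroneous data reaches the module
rf.step((2, 7))    # operation performed on the erroneous data
rf.rollback(2)     # late error indication: undo the last two cycles
value_r1 = rf.read(1)
value_r2 = rf.read(2)
```

In this model the check runs off the critical path: writes proceed at full speed, and the cost of late detection is the buffer needed to keep `depth` cycles of writes undoable.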

Index Terms:
fault-tolerant VLSI systems; micro rollback; concurrent error detection; hardware mechanism; error detection; fault tolerant computing; VLSI.
Y. Tamir, M. Tremblay, "High-Performance Fault-Tolerant VLSI Systems Using Micro Rollback," IEEE Transactions on Computers, vol. 39, no. 4, pp. 548-554, April 1990, doi:10.1109/12.54848