Argus: Low-Cost, Comprehensive Error Detection in Simple Cores
IEEE Micro, January/February 2008 (vol. 28, no. 1)
pp. 52-59
Albert Meixner, Duke University
Michael E. Bauer, Duke University
Daniel J. Sorin, Duke University
Argus, a novel approach for detecting errors in simple processor cores, dynamically verifies the correctness of the four tasks performed by a von Neumann core: control flow, data flow, computation, and memory access. Argus detects transient and permanent errors, with far lower impact on performance and chip area than previous techniques.

Index Terms:
microarchitecture, error detection, dependability, fault tolerance
Citation:
Albert Meixner, Michael E. Bauer, Daniel J. Sorin, "Argus: Low-Cost, Comprehensive Error Detection in Simple Cores," IEEE Micro, vol. 28, no. 1, pp. 52-59, Jan.-Feb. 2008, doi:10.1109/MM.2008.3