2007 IEEE 13th International Symposium on High Performance Computer Architecture
Error Detection via Online Checking of Cache Coherence with Token Coherence Signatures
Scottsdale, AZ, USA
February 10-February 14
ISBN: 1-4244-0804-0
Albert Meixner, Dept. of Computer Science, Duke University, albert@cs.duke.edu
Daniel J. Sorin, Dept. of Electrical and Computer Engineering, Duke University, sorin@ee.duke.edu
To provide high dependability in a multithreaded system despite hardware faults, the system must detect and correct errors in its shared memory system. Recent research has explored dynamic checking of cache coherence as a comprehensive approach to memory system error detection. However, existing coherence checkers are costly to implement, incur high interconnection network traffic overhead, and do not scale well. In this paper, we describe the Token Coherence Signature Checker (TCSC), which provides comprehensive, low-cost, scalable coherence checking by maintaining signatures that represent recent histories of coherence events at all nodes (cache and memory controllers). Periodically, these signatures are sent to a verifier to determine if an error occurred. TCSC has a small constant hardware cost per node, independent of cache and memory size and the number of nodes. TCSC's interconnect bandwidth overhead has a constant upper bound and never exceeds 7% in our experiments. TCSC has negligible impact on system performance.
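The signature idea in the abstract can be illustrated with a toy model. The sketch below is not the paper's signature function or hardware design; it only assumes that each node folds its coherence events (here, token gains and losses per address) into one fixed-width value, and that a verifier combines the reported values to check an invariant (here, that tokens are conserved). The names `Node`, `mix`, `transfer`, and `verify` are hypothetical.

```python
MASK = (1 << 64) - 1  # fixed-width signature: per-node state does not grow with cache size


def mix(address: int) -> int:
    """Hypothetical per-address weight (assumption, not from the paper); forced odd."""
    return ((address * 0x9E3779B97F4A7C15) & MASK) | 1


class Node:
    """A cache or memory controller that keeps one running signature."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.signature = 0

    def record(self, address: int, token_delta: int) -> None:
        # Fold one coherence event (tokens gained or lost for an address)
        # into the signature with modular arithmetic.
        self.signature = (self.signature + mix(address) * token_delta) & MASK

    def report(self) -> int:
        # Periodically hand the signature to the verifier and start a new interval.
        sig, self.signature = self.signature, 0
        return sig


def transfer(src: Node, dst: Node, address: int, tokens: int) -> None:
    """Fault-free token movement: the sender loses what the receiver gains."""
    src.record(address, -tokens)
    dst.record(address, tokens)


def verify(signatures: list[int]) -> bool:
    # If tokens are conserved, every gain is cancelled by a matching loss,
    # so the signatures sum to zero modulo 2**64.
    return sum(signatures) & MASK == 0
```

In this toy, a token that appears at one node without a matching loss elsewhere leaves a nonzero sum, so the verifier flags the interval. The per-node state is a single 64-bit value regardless of how many addresses or events are seen, which mirrors the constant per-node cost claimed in the abstract, although the real design differs in its details.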
Citation:
Albert Meixner, Daniel J. Sorin, "Error Detection via Online Checking of Cache Coherence with Token Coherence Signatures," 2007 IEEE 13th International Symposium on High Performance Computer Architecture (HPCA), pp. 145-156, 2007