loading...
 This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
14th IEEE International Symposium on Modeling, Analysis, and Simulation
Characterizing Microarchitecture Soft Error Vulnerability Phase Behavior
Monterey, CA
September 11-September 14
ISBN: 0-7695-2573-3
Xin Fu, University of Florida, USA
James Poe, University of Florida, USA
Tao Li, University of Florida, USA
Jos? A. B. Fortes, University of Florida, USA
Computer systems increasingly depend on exploiting program dynamic behavior to optimize performance, power and reliability. Prior studies have shown that program execution exhibits phase behavior in both performance and power domains. Reliabilityoriented program phase behavior, however, remains largely unexplored. As semiconductor transient faults (soft errors) emerge as a critical challenge to reliable system design, characterizing program phase behavior from a reliability perspective is crucial in order to apply dynamic fault-tolerant mechanisms and to optimize performance/reliability trade-offs.

In this paper, we compute run-time program vulnerability to soft errors on four microarchitecture structures (i.e. instruction window, reorder buffer, function units and wakeup table) in a high-performance out-of-order execution superscalar processor. Experimental results on the SPEC2000 benchmarks show a considerable amount of time varying behavior in reliability measurements. Our study shows that a single performance metric, such as IPC, cache miss or branch misprediction, is not a good indicator for program vulnerability. The vulnerabilities of the studied microarchitecture structures are then correlated with program code-structure and run-time events to identify vulnerability phase behavior. We observed that both program code-structure and run-time events appear promising in classifying program reliability phase behavior. Overall, performance counter based schemes achieved an average Coefficient of Variation (COV) of 3.5%, 4.5%, 4.3% and 5.7% on the instruction queue, reorder buffer, function units and the wakeup table, while basic block vectors offer COVs of 4.9%, 5.8%, 5.4% and 6% on the four studied microarchitecture structures respectively. We found that in general, tracking performance metrics performs better than tracking control flow in identifying reliability phase behavior of applications. To our knowledge, this paper is the first to characterize program reliability phase behavior at the microarchitecture level.

Citation:
Xin Fu, James Poe, Tao Li, Jos? A. B. Fortes, "Characterizing Microarchitecture Soft Error Vulnerability Phase Behavior," mascots, pp.147-155, 14th IEEE International Symposium on Modeling, Analysis, and Simulation, 2006
Usage of this product signifies your acceptance of the Terms of Use.