2009 24th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems
Resilience Challenges for Exascale Systems
Chicago, Illinois
October 7-9, 2009
ISBN: 978-0-7695-3839-6
The combination of decreasing device reliability due to deep submicron scaling, increasing integration, and the size of future exascale high-performance computers and cloud datacenters poses significant challenges for system resilience. Furthermore, because power and cost are of critical importance, resilience must be provided efficiently and economically. Although providing resilience will require a range of approaches at all levels of the system stack, the final responsibility rests at the system level. In addition to highlighting challenges, this talk reviews and introduces promising system-level techniques such as configurable isolation, duplication caching, multicore DIMMs, CoVeRT, and 3D checkpointing.
Index Terms:
Resilience, exascale systems, isolation, duplication, checkpointing
Citation:
Norman Paul Jouppi, "Resilience Challenges for Exascale Systems," dft, pp.379, 2009 24th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems, 2009