Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques (2005)
St. Louis, Missouri
Sept. 17, 2005 to Sept. 21, 2005
ISSN: 1089-795X
ISBN: 0-7695-2429-X
pp: 315-328
Michael C. Huang, Department of Electrical & Computer Engineering, University of Rochester
Edwin J. Tan, Department of Electrical & Computer Engineering, University of Rochester
M. Wasiur Rashid, Department of Electrical & Computer Engineering, University of Rochester
David H. Albonesi, Computer Systems Laboratory, Cornell University
ABSTRACT
As device dimensions continue to be aggressively scaled, microprocessors are becoming increasingly vulnerable to undesired energy, such as that of a cosmic particle strike, which can cause transient errors. To prevent operational failure due to these errors, system-level techniques such as redundant execution will be increasingly required for fault detection and tolerance in future processors. However, the need for redundancy is directly opposed to the growing need for more power-efficient operation. Conventional techniques that use multi-core microarchitectures to provide whole-thread duplication generally incur significant energy overhead, which can exacerbate the already severe problem of power consumption and heat dissipation at a given throughput requirement. Future approaches that supply the necessary level of robustness at a given throughput level must therefore also be power-aware.

We propose a thread-level redundant execution microarchitecture that significantly reduces the energy overhead of replication without unduly impacting performance. Our approach exploits the fact that, with appropriate hardware support, the verification operation can be parallelized and run on a chip multiprocessor with support for frequency scaling together with supply voltage scaling and/or body biasing. To further improve the efficiency of verification, we use the information obtained by the leading thread to assist the trailing verification threads. We discuss in detail the required architectural support and show that our approach can be highly energy-efficient: using two checkers, fully replicated execution costs only an average of 28% extra energy over non-redundant execution, with virtually no performance loss.
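The intuition behind parallelizing verification across slower, lower-voltage cores can be illustrated with a first-order energy model. The sketch below is not the paper's evaluation model: it assumes dynamic energy per operation scales with the square of the supply voltage, that achievable frequency scales linearly with the voltage above a threshold, that verification work splits perfectly across checkers, and that leakage, communication, and the leading-thread hints are ignored. The function names and the threshold value of 0.3 are hypothetical, chosen only for illustration.

```python
def checker_voltage(freq: float, v_threshold: float = 0.3) -> float:
    """Normalized supply voltage needed to sustain normalized frequency `freq`.

    Assumption: frequency is proportional to (V - V_threshold), with V = 1.0
    at full speed (freq = 1.0). The threshold 0.3 is an illustrative value.
    """
    if not 0.0 < freq <= 1.0:
        raise ValueError("freq must be in (0, 1]")
    return v_threshold + (1.0 - v_threshold) * freq


def verification_energy_overhead(num_checkers: int,
                                 v_threshold: float = 0.3) -> float:
    """Extra dynamic energy for re-execution, relative to the leading thread.

    Assumption: the checkers together re-execute the same number of operations
    as the leading thread, each running at 1/num_checkers of full frequency,
    and energy per operation is proportional to V squared.
    """
    if num_checkers < 1:
        raise ValueError("need at least one checker")
    freq = 1.0 / num_checkers
    voltage = checker_voltage(freq, v_threshold)
    return voltage ** 2


if __name__ == "__main__":
    for n in (1, 2, 4):
        overhead = verification_energy_overhead(n)
        print(f"{n} checker(s): about {overhead:.0%} extra dynamic energy")
```

Under these toy assumptions, a single full-speed checker doubles the energy, while splitting verification across more checkers lets each run at a lower voltage and reduces the overhead. The numbers it produces are illustrative only and should not be compared with the 28% figure reported above.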
CITATION
Michael C. Huang, Edwin J. Tan, M. Wasiur Rashid, David H. Albonesi, "Exploiting Coarse-Grain Verification Parallelism for Power-Efficient Fault Tolerance", Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques, pp. 315-328, 2005, doi:10.1109/PACT.2005.20