Parallel and Distributed Processing Symposium, International (2010)
Atlanta, GA, USA
Apr. 19, 2010 to Apr. 23, 2010
ISBN: 978-1-4244-6442-5
pp: 1-12
Konrad Malkowski, Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA 16802, USA
Padma Raghavan, Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA 16802, USA
Mahmut Kandemir, Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA 16802, USA
ABSTRACT
As chip transistor densities continue to increase, soft errors (bit flips) are becoming a significant concern in networked multiprocessors with multicore nodes. Large cache structures in multicore processors are especially susceptible to soft errors as they occupy a significant portion of the chip area. In this paper, we consider the impacts of soft errors in caches on the resilience and energy efficiency of sparse linear solvers. In particular, we focus on two widely used sparse iterative solvers, namely Conjugate Gradient (CG) and Generalized Minimum Residuals (GMRES). We propose two adaptive schemes, (i) a Write Eviction Hybrid ECC (WEH-ECC) scheme for the L1 cache and (ii) a Prefetcher Based Adaptive ECC (PBA-ECC) scheme for the L2 cache, and evaluate the energy and reliability trade-offs they bring in the context of GMRES and CG solvers. Our evaluations indicate that WEH-ECC reduces the CG and GMRES soft error vulnerability by a factor of 18 to 220 in L1 cache, relative to an unprotected L1 cache, and energy consumption by 16%, relative to a cache with strong protection. The PBA-ECC scheme reduces the CG and GMRES soft error vulnerability by a factor of 9 - 10
CITATION

P. Raghavan, K. Malkowski and M. Kandemir, "Analyzing the soft error resilience of linear solvers on multicore multiprocessors," 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), Atlanta, GA, 2010, pp. 1-12.
doi:10.1109/IPDPS.2010.5470411