2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW) (2010)
Atlanta, GA, USA
Apr. 19, 2010 to Apr. 23, 2010
ISBN: 978-1-4244-6533-0
pp: 1-8
Konrad Malkowski, Department of Computer Science and Engineering, The Pennsylvania State University, University Park, 16802, USA
Padma Raghavan, Department of Computer Science and Engineering, The Pennsylvania State University, University Park, 16802, USA
Mahmut Kandemir, Department of Computer Science and Engineering, The Pennsylvania State University, University Park, 16802, USA
Mary Jane Irwin, Department of Computer Science and Engineering, The Pennsylvania State University, University Park, 16802, USA
ABSTRACT
We consider a non-uniform access latency cache architecture (NUCA) design for 3D chip multi-processors (CMPs) where cache structures are divided into small banks interconnected by a network-on-chip (NoC). In earlier NUCA designs, data is placed in banks either statically (S-NUCA) or dynamically (D-NUCA). In both S-NUCA and D-NUCA designs, scaling to hundreds of cores can pose several challenges. Thus, we propose a new NUCA architecture with an inclusive, octal tree-based, hierarchical directory (T-NUCA-8), with the potential to scale to hundreds of cores with performance comparable to D-NUCA at a fraction of the energy cost. Our evaluations indicate that, relative to D-NUCA, T-NUCA-8 reduces network usage by 92%, energy by 87%, and energy-delay product (EDP) by 87%, at a performance cost of 10%.
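The abstract does not spell out how the directory is organized beyond "inclusive, octal tree-based, hierarchical". As a rough illustration of that general idea only, the following Python sketch models a directory tree with fan-out 8 in which every node records, for each cached line, which child subtree holds it. The class name `DirNode`, its fields, and the hop counting are hypothetical and are not taken from the paper; the sketch also assumes the number of banks is a power of 8.

```python
class DirNode:
    """One node of a hypothetical inclusive directory tree with fan-out 8."""
    FANOUT = 8

    def __init__(self, first_bank, span):
        self.first_bank = first_bank  # lowest bank id covered by this subtree
        self.span = span              # number of banks covered (power of 8)
        self.lines = {}               # line address -> index of child holding it
        self.children = []
        if span > 1:
            child_span = span // self.FANOUT
            self.children = [
                DirNode(first_bank + i * child_span, child_span)
                for i in range(self.FANOUT)
            ]

    def insert(self, addr, bank):
        """Record addr as cached in bank; each ancestor records it too (inclusion)."""
        if self.span == 1:
            self.lines[addr] = None  # leaf: the bank itself holds the line
            return
        child_span = self.span // self.FANOUT
        idx = (bank - self.first_bank) // child_span
        self.lines[addr] = idx
        self.children[idx].insert(addr, bank)

    def lookup(self, addr):
        """Return (bank, hops); bank is None if addr is not cached below this node."""
        node, hops = self, 0
        while addr in node.lines:
            if node.span == 1:
                return node.first_bank, hops
            node = node.children[node.lines[addr]]
            hops += 1
        return None, hops


root = DirNode(0, 64)        # 64 banks: two directory levels above the banks
root.insert(0x1000, 42)
print(root.lookup(0x1000))   # hit: walks one child pointer per level
print(root.lookup(0x2000))   # miss: resolved at the root without any broadcast
```

In this toy model a lookup follows at most one pointer per tree level, so the number of steps grows with the logarithm (base 8) of the bank count rather than with the bank count itself. That is one plausible reading of why a tree directory could cut network traffic compared with a search across banks, but the paper itself should be consulted for the actual mechanism.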
CITATION

K. Malkowski, M. Kandemir, M. J. Irwin and P. Raghavan, "T-NUCA - a novel approach to non-uniform access latency cache architectures for 3D CMPs," 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), Atlanta, GA, USA, 2010, pp. 1-8.
doi:10.1109/IPDPSW.2010.5470910