Issue No.02 - July-Dec. (2012 vol.11)
pp: 61-64
Justin Meza, Carnegie Mellon University, Pittsburgh
Jichuan Chang, Hewlett-Packard Labs, Palo Alto
HanBin Yoon, Carnegie Mellon University, Pittsburgh
Onur Mutlu, Carnegie Mellon University, Pittsburgh
Parthasarathy Ranganathan, Hewlett-Packard Labs, Palo Alto
ABSTRACT
Hybrid main memories composed of DRAM as a cache to scalable non-volatile memories such as phase-change memory (PCM) can provide much larger storage capacity than traditional main memories. A key challenge for enabling high-performance and scalable hybrid memories, though, is efficiently managing the metadata (e.g., tags) for data cached in DRAM at a fine granularity. Based on the observation that storing metadata off-chip in the same row as their data exploits DRAM row buffer locality, this paper reduces the overhead of fine-granularity DRAM caches by only caching the metadata for recently accessed rows on-chip using a small buffer. Leveraging the flexibility and efficiency of such a fine-granularity DRAM cache, we also develop an adaptive policy to choose the best granularity when migrating data into DRAM. On a hybrid memory with a 512MB DRAM cache, our proposal using an 8KB on-chip buffer can achieve within 6% of the performance of, and 18% better energy efficiency than, a conventional 8MB SRAM metadata store, even when the energy overhead due to large SRAM metadata storage is not considered.
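The central idea in the abstract, keeping on-chip only the metadata for recently accessed DRAM rows and fetching the rest from the row that holds the data, can be illustrated with a minimal sketch. This is not the paper's implementation: the class name MetadataBuffer, the LRU replacement policy, the fetch_row_metadata helper, and the two-row capacity are all assumptions made for illustration, since the abstract specifies only "a small buffer".

```python
from collections import OrderedDict


class MetadataBuffer:
    """Hypothetical on-chip buffer caching metadata for recently used DRAM rows.

    Assumption: least-recently-used replacement. The abstract does not
    state which replacement policy the buffer uses.
    """

    def __init__(self, capacity_rows):
        self.capacity_rows = capacity_rows
        self.rows = OrderedDict()  # row id -> metadata (e.g., tags) for that row
        self.hits = 0
        self.misses = 0

    def lookup(self, row_id, fetch_from_dram):
        """Return the metadata for row_id, fetching it off-chip on a miss."""
        if row_id in self.rows:
            self.hits += 1
            self.rows.move_to_end(row_id)  # mark as most recently used
            return self.rows[row_id]

        # Miss: read the metadata stored in the same DRAM row as the data.
        self.misses += 1
        metadata = fetch_from_dram(row_id)
        self.rows[row_id] = metadata
        if len(self.rows) > self.capacity_rows:
            self.rows.popitem(last=False)  # evict the least recently used row
        return metadata


def fetch_row_metadata(row_id):
    # Hypothetical stand-in for reading the tags kept in a DRAM row.
    return {"row": row_id, "tags": []}


buf = MetadataBuffer(capacity_rows=2)
for row in [0, 0, 1, 0, 2, 1]:
    buf.lookup(row, fetch_row_metadata)
print(buf.hits, buf.misses)  # prints "2 4"
```

In this toy access stream, repeated accesses to row 0 hit in the buffer, while each newly touched or evicted row costs one off-chip metadata fetch. Accesses with high row locality therefore need few metadata fetches even when the buffer is small.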
INDEX TERMS
Random access memory, Phase change materials, System-on-a-chip, Buffer storage, Bandwidth, Memory management, Cache memory, non-volatile memories, Indexes, hybrid main memories, Cache memories, tag storage
CITATION
Justin Meza, Jichuan Chang, HanBin Yoon, Onur Mutlu, Parthasarathy Ranganathan, "Enabling Efficient and Scalable Hybrid Memories Using Fine-Granularity DRAM Cache Management", IEEE Computer Architecture Letters, vol. 11, no. 2, pp. 61-64, July-Dec. 2012, doi:10.1109/L-CA.2012.2