2015 International Conference on Parallel Architecture and Compilation (PACT) (2015)
San Francisco, CA, USA
Oct. 18, 2015 to Oct. 21, 2015
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/PACT.2015.11
A traditional least-recently used (LRU) cache replacement policy fails to achieve the performance of the optimal replacement policy when cache blocks with diverse reuse characteristics interfere with each other. When multiple applications share a cache, it is often partitioned among the applications because cache blocks show similar reuse characteristics within each application. In this paper, we extend the idea to a single application by viewing a cache as a shared resource between individual memory instructions. To that end, we propose Instruction-based LRU (ILRU), a fine grain cache partitioning that way-partitions individual cache sets based on per-instruction working blocks, which are cache blocks required by an instruction to satisfy all the reuses within a set. In ILRU, a memory instruction steals a block from another only when it requires more blocks than it currently has. Otherwise, a memory instruction victimizes among the cache blocks inserted by itself. Experiments show that ILRU can improve the cache performance in all levels of cache, reducing the number of misses by an average of 7.0% for L1, 9.1% for L2, and 8.7% for L3, which results in a geometric mean performance improvement of 5.3%. ILRU for a three-level cache hierarchy imposes a modest 1.3% storage overhead over the total cache size.
Position measurement, Interference, Memory management, Hardware, Monitoring, Parallel architectures, Approximation algorithms
J. J. Park, Y. Park and S. Mahlke, "Fine Grain Cache Partitioning Using Per-Instruction Working Blocks," 2015 International Conference on Parallel Architecture and Compilation (PACT), San Francisco, CA, USA, 2015, pp. 305-316.