ABSTRACT
Recent studies have shown that in highly associative caches, the performance gap between the Least Recently Used (LRU) and the theoretical optimal replacement algorithms is large, motivating the design of alternative replacement algorithms to improve cache performance. In LRU replacement, a line, after its last use, remains in the cache for a long time until it becomes the LRU line. Such dead lines unnecessarily reduce the cache capacity available for other lines. In addition, in multi-level caches, temporal reuse patterns are often inverted: reuse shows up in the L1 cache but, because of the filtering effect of the L1 cache, does not show up in the L2 cache. At the L2, these lines appear to be brought into the cache but are never used until they are replaced. These lines unnecessarily pollute the L2 cache. This paper proposes a new counter-based approach to address both problems. For the former problem, we predict lines that have become dead, and replace them early from the L2 cache. For the latter problem, we identify never-used lines, bypass the L2 cache, and directly place them in the L1 cache. Both techniques are achieved through a single counter-based mechanism. In our approach, each line in the L2 cache is augmented with an event counter that is incremented when an event of interest, such as certain cache accesses, occurs. When the counter reaches a threshold, the line "expires" and becomes replaceable. Each line's threshold is unique and is dynamically learned. We propose and evaluate two new replacement algorithms: Access Interval Predictor (AIP) and Live-time Predictor (LvP). AIP and LvP speed up 10 capacity-constrained SPEC2000 benchmarks by up to 40%, and by 11% on average. Cache bypassing further reduces L2 cache pollution and improves the average speedups to 13-14%.
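The counter-and-threshold idea described above can be illustrated with a small sketch. It is not the paper's AIP or LvP algorithm; it only models one cache set under assumptions that the abstract does not state. The names `Line` and `CounterSet` are hypothetical. The "event of interest" is assumed to be an access to the set that targets a different line, the threshold is assumed to be learned as the largest interval observed between two accesses to the same line, and a line is treated as expired once its counter exceeds that learned threshold. A line with no learned threshold never expires, and LRU is used as the fallback when no line has expired. Bypassing is not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Line:
    tag: int
    counter: int = 0                 # events seen since this line's last access
    threshold: Optional[int] = None  # learned expiry threshold (None = not learned yet)
    last_use: int = 0                # timestamp used for LRU ordering

    def expired(self) -> bool:
        # Assumption: a line expires once its counter exceeds the learned threshold.
        return self.threshold is not None and self.counter > self.threshold


class CounterSet:
    """One cache set with counter-based expiration and an LRU fallback (sketch)."""

    def __init__(self, ways: int):
        self.ways = ways
        self.lines = []
        self.clock = 0

    def access(self, tag: int) -> bool:
        """Return True on a hit; on a miss, fill the line and return False."""
        self.clock += 1
        hit = next((l for l in self.lines if l.tag == tag), None)
        # Assumed event of interest: an access to this set that targets another line.
        for l in self.lines:
            if l is not hit:
                l.counter += 1
        if hit is not None:
            # Assumed learning rule: keep the largest interval seen so far.
            if hit.threshold is None:
                hit.threshold = hit.counter
            else:
                hit.threshold = max(hit.threshold, hit.counter)
            hit.counter = 0
            hit.last_use = self.clock
            return True
        if len(self.lines) >= self.ways:
            self.lines.remove(self.pick_victim())
        self.lines.append(Line(tag=tag, last_use=self.clock))
        return False

    def pick_victim(self) -> Line:
        # Prefer expired lines; if none has expired, fall back to plain LRU.
        expired = [l for l in self.lines if l.expired()]
        candidates = expired or self.lines
        return min(candidates, key=lambda l: l.last_use)
```

With a 2-way set and the access sequence `1, 4, 4, 4, 1, 2, 2, 3`, line 1 learns a long interval while line 2 is only reused back to back. On the final miss, line 2 has expired and is evicted even though line 1 is the LRU line, so a later access to line 1 still hits. Plain LRU would have evicted line 1 instead.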
INDEX TERMS
Cache memories, Cache Replacement, Cache Bypassing, Counter-Based Algorithms, Cache Misses
CITATION
Yan Solihin, Mazen Kharbutli, "Counter-Based Cache Replacement and Bypassing Algorithms", IEEE Transactions on Computers, vol. 57, no. , pp. 433-447, April 2008, doi:10.1109/TC.2007.70816