Parallel and Distributed Systems, International Conference on (2011)
Tainan, Taiwan
Dec. 7, 2011 to Dec. 9, 2011
ISSN: 1521-9097
ISBN: 978-0-7695-4576-9
pp: 284-291
As the number of processors sharing a cache increases, conflict misses caused by interference among competing processes have a growing impact on the performance of each process. Cache partitioning allocates a cache between concurrently executing processes to counteract the effects of inter-process conflicts. However, cache partitioning methods commonly divide a shared cache into private partitions, each dedicated to a single processor, which can leave portions of the cache underutilized when set accesses are non-uniform. Our proposed method complements these cache partitioning algorithms by creating an additional partition that is shared among all processors. Underutilized areas of the cache are identified by a monitoring circuit and used for the shared partition. Detection of underutilization is based on the number of unique set accesses for a given allocated way. For a 16-way set-associative cache, the implementation of our method requires 64 bytes of storage overhead per core in addition to that needed for the method that determines the sizes of the private partitions. For the tested system and a number of selected benchmark sets, our method improves performance over the traditional LRU policy by an average of 1.4% and up to 13.3% for a two-core system, and by an average of 1.4% and up to 7.8% for a four-core system. It improves the performance of a conventional cache partitioning method (Utility-Based Cache Partitioning) by an average of 0.1% and up to 0.5% for both two-core and four-core systems.
cache partitioning, shared cache, set utilization, chip multi-processor
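
The abstract states that underutilization is detected from the number of unique set accesses for a given allocated way, but it does not describe the monitoring circuit itself. The following is a minimal software sketch of that idea only, not the authors' design: the class name `WayUtilizationMonitor`, the per-epoch tracking with Python sets, and the `threshold_fraction` parameter are all hypothetical, and real hardware would use a bounded structure rather than unbounded sets.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class WayUtilizationMonitor:
    """Hypothetical monitor: records which sets were touched in each way."""
    num_sets: int
    num_ways: int
    touched: List[Set[int]] = field(init=False)

    def __post_init__(self) -> None:
        # One collection of touched set indices per way (assumed structure).
        self.touched = [set() for _ in range(self.num_ways)]

    def record_access(self, way: int, set_index: int) -> None:
        """Note that `set_index` was accessed in `way` during this epoch."""
        self.touched[way].add(set_index)

    def unique_set_counts(self) -> List[int]:
        """Number of distinct sets accessed in each way."""
        return [len(s) for s in self.touched]

    def underutilized_ways(self, threshold_fraction: float) -> List[int]:
        """Ways whose unique-set count is below an assumed threshold."""
        limit = threshold_fraction * self.num_sets
        return [w for w, s in enumerate(self.touched) if len(s) < limit]


# Small example: 8 sets, 4 ways, with non-uniform set accesses.
monitor = WayUtilizationMonitor(num_sets=8, num_ways=4)
for s in range(8):
    monitor.record_access(0, s)      # way 0 touches every set
for s in (0, 0, 1):
    monitor.record_access(1, s)      # way 1 touches only two distinct sets
for s in range(4):
    monitor.record_access(3, s)      # way 3 touches half of the sets

counts = monitor.unique_set_counts()
candidates = monitor.underutilized_ways(threshold_fraction=0.5)
```

In this sketch, ways flagged by `underutilized_ways` would be the candidates for a shared partition; how the paper actually selects and assigns them is not given in the abstract.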

C. Chung and P. Deayton, "Set Utilization Based Dynamic Shared Cache Partitioning," Parallel and Distributed Systems, International Conference on (ICPADS), Tainan, Taiwan, 2011, pp. 284-291.