The Community for Technology Leaders
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (2002)
Charlottesville, Virginia
Sept. 22, 2002 to Sept. 25, 2002
ISSN: 1089-795X
ISBN: 0-7695-1620-3
pp: 155
Manuel E. Acacio , Universidad de Murcia
José González , Intel Barcelona Research Center
José M. García , Universidad de Murcia
José Duato , Universidad Politécnica de Valencia
This work is focused on accelerating upgrade misses in cc-NUMA multiprocessors. These misses are caused by store instructions for which a read-only copy of the line is found in the L2 cache. Upgrade misses require a message sent from the missing node to the directory, a directory lookup in order to find the set of sharers, invalidation messages being sent to the sharers and responses to the invalidations being sent back. Therefore, the penalty paid by these misses is not negligible, mainly if we consider that they account for a high percentage of the total miss rate. We propose the use of prediction as a means of providing cc-NUMA multiprocessors with a more efficient support for upgrade misses by directly invalidating sharers from the missing node. Our proposal comprises an effective prediction scheme achieving high hit rates as well as a coherence protocol extended to support the use of prediction. Our work is motivated by two key observations: first, upgrade misses present a repetitive behavior and, second, the total number of sharers being invalidated is small (one, in some cases). Using execution-driven simulations, we show that the use of prediction can significantly accelerate upgrade misses (latency reductions of more than 40% in some cases). These important improvements translate into speed-ups on application performance up to 14%. Finally, these results can be obtained including a predictor with a total size of less than 48 KB in every node.
Manuel E. Acacio, José González, José M. García, José Duato, "The Use of Prediction for Accelerating Upgrade Misses in cc-NUMA Multiprocessors", Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, vol. 00, no. , pp. 155, 2002, doi:10.1109/PACT.2002.1106014
94 ms
(Ver 3.3 (11022016))