Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques (2003)
New Orleans, Louisiana
Sept. 27, 2003 to Oct. 1, 2003
ISSN: 1089-795X
ISBN: 0-7695-2021-9
pp: 4
Harold W. Cain, University of Wisconsin-Madison
Ravi Nair, IBM T.J. Watson Research Center
Mikko H. Lipasti, University of Wisconsin-Madison
This paper presents a framework for analyzing the performance of multithreaded programs using a model called a constraint graph. We review previous constraint graph definitions for sequentially consistent systems and extend them to other memory consistency models. Using this framework, we present two constraint graph analysis case studies based on several commercial and scientific workloads running on a full-system simulator. The first case study illustrates how a constraint graph can be used to determine the necessary conditions for implementing a memory consistency model, rather than conservative sufficient conditions. Using this method, we classify coherence misses as either required or unnecessary. We determine that, on average, one half of all load instructions that suffer cache misses due to coherence activity are stalled unnecessarily, because the original copy of the cache line could have been used without violating the memory consistency model. The second case study demonstrates the effects of memory consistency constraints on the fundamental limits of instruction-level parallelism, compared to previous estimates that did not include multiprocessor constraints. Using this method, we determine the fundamental performance differences among various memory consistency models for processors that do not perform consistency-related speculation.
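To make the constraint graph idea concrete, the following is a minimal Python sketch for the sequentially consistent case only. It is not the authors' tool or their exact definitions. It follows the commonly used formulation in which nodes are memory operations, edges capture program order and read-after-write, write-after-read, and write-after-write dependences, and an acyclic graph indicates that the execution is sequentially consistent. The function names, the trace format, and the simplifying assumptions (every write to an address stores a unique nonzero value, memory starts at 0, and the per-address write order is known) are all hypothetical and chosen for illustration.

```python
from collections import defaultdict

def build_constraint_graph(threads, write_order):
    """Build a constraint graph for one observed execution.

    threads:     {tid: [(kind, addr, value), ...]} in program order,
                 where kind is "R" or "W" (assumed trace format).
    write_order: {addr: [value, ...]}, the order in which writes to
                 addr became visible (assumed to be known).
    Returns an adjacency dict over nodes (tid, index).
    """
    edges = defaultdict(set)
    writer = {}                    # (addr, value) -> node that wrote it
    readers = defaultdict(list)    # (addr, value) -> nodes that read it
    for tid, ops in threads.items():
        for i, (kind, addr, value) in enumerate(ops):
            node = (tid, i)
            edges[node]            # register the node even if it has no edges
            if i > 0:
                edges[(tid, i - 1)].add(node)      # program order
            if kind == "W":
                writer[(addr, value)] = node
            else:
                readers[(addr, value)].append(node)
    for addr, values in write_order.items():
        # Reads of the initial value 0 must precede the first write.
        first = writer[(addr, values[0])]
        for r in readers[(addr, 0)]:
            edges[r].add(first)                    # write-after-read
        for pos, value in enumerate(values):
            w = writer[(addr, value)]
            nxt = writer[(addr, values[pos + 1])] if pos + 1 < len(values) else None
            if nxt is not None:
                edges[w].add(nxt)                  # write-after-write
            for r in readers[(addr, value)]:
                edges[w].add(r)                    # read-after-write
                if nxt is not None:
                    edges[r].add(nxt)              # write-after-read
    return edges

def has_cycle(edges):
    """Depth-first search with three colors; True if the graph has a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = defaultdict(int)
    def visit(n):
        color[n] = GREY
        for m in edges[n]:
            if color[m] == GREY or (color[m] == WHITE and visit(m)):
                return True
        color[n] = BLACK
        return False
    return any(color[n] == WHITE and visit(n) for n in list(edges))

# Store-buffering litmus test: both loads return the initial value 0.
forbidden = {0: [("W", "x", 1), ("R", "y", 0)],
             1: [("W", "y", 1), ("R", "x", 0)]}
# Same program, but thread 1 observes thread 0's store to x.
allowed = {0: [("W", "x", 1), ("R", "y", 0)],
           1: [("W", "y", 1), ("R", "x", 1)]}
order = {"x": [1], "y": [1]}
```

In this sketch, the `forbidden` trace produces a cycle, so it cannot be a sequentially consistent execution, while the `allowed` trace produces an acyclic graph. Extending the check to weaker consistency models would mean dropping or relaxing some of the program-order edges, which this sketch does not attempt.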
Harold W. Cain, Ravi Nair, Mikko H. Lipasti, "Constraint Graph Analysis of Multithreaded Programs", Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques, pp. 4, 2003, doi:10.1109/PACT.2003.1237997