Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (2011)
Galveston, Texas USA
Oct. 10, 2011 to Oct. 14, 2011
DOI: http://doi.ieeecomputersociety.org/10.1109/PACT.2011.26
Future scalable multi-core chips are expected to implement a shared last-level cache (LLC) whose banks are distributed across the chip, so each core sees non-uniform access latencies across banks. Consequently, high performance and energy efficiency depend on whether a thread's data is placed in local or nearby banks. Using compiler and programmer support, we aim to find an alternative to existing high-overhead designs. In this paper, we take existing parallel programs written in Pthreads and show the performance gap between current static mapping schemes, costly migration schemes, and idealized static and dynamic best-case scenarios.
Data locality, static-NUCA, Performance
Gagandeep S. Sachdev, Kshitij Sudan, Mary W. Hall, Rajeev Balasubramonian, "Understanding the Behavior of Pthread Applications on Non-Uniform Cache Architectures", Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, pp. 175-176, 2011, doi:10.1109/PACT.2011.26