2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT) (2010)
Sept. 11, 2010 to Sept. 15, 2010
DOI Bookmark: http://doi.ieeecomputersociety.org/
Yong Li, Department of ECE, University of Pittsburgh, Benedum Hall, PA, 15261, USA
Rami Melhem, Department of CS, University of Pittsburgh, Sennott Square, PA, 15260, USA
Ahmed Abousamra, Department of CS, University of Pittsburgh, Sennott Square, PA, 15260, USA
Alex K. Jones, Department of ECE, University of Pittsburgh, Benedum Hall, PA, 15261, USA
Data access latency, a limiting factor in the performance of chip multiprocessors, grows significantly with the number of cores in non-uniform cache architectures with distributed cache banks. To mitigate this effect, it is necessary to exploit data access locality and choose an optimal data placement. Achieving this is especially challenging when other constraints such as cache capacity, coherence messages, and runtime overhead need to be considered. This paper presents a compiler-based approach to analyzing data access behavior in multi-threaded applications. The proposed experimental compiler framework employs novel compilation techniques to discover and represent multi-threaded memory access patterns (MMAPs). At run time, symbolic MMAPs are resolved and used by a partitioning algorithm to choose a partition of allocated memory blocks among the forked threads in the analyzed application. This partition is used to enforce data ownership by associating the data with the core that executes the thread owning the data. We demonstrate how this information can be used in an experimental architecture to accelerate applications. In particular, our compiler-assisted approach shows a 20% speedup over shared caching and a 5% speedup over the closest runtime approximation, “first touch”.
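As a rough illustration of the partitioning idea described in the abstract, the following is a minimal sketch and not the paper's algorithm. It assumes the resolved access patterns are available as per-thread access estimates for each memory block, and it models cache capacity as a simple cap on the number of blocks a thread may own. The function name `partition_blocks`, the `access_counts` layout, and the `capacity` parameter are all hypothetical.

```python
from collections import defaultdict


def partition_blocks(access_counts, capacity):
    """Greedily assign each memory block to an owning thread.

    access_counts: {block_id: {thread_id: estimated_accesses}}
                   (hypothetical stand-in for a resolved access pattern)
    capacity:      maximum number of blocks one thread may own
                   (assumed stand-in for a cache capacity constraint)

    Returns {block_id: thread_id}. A block stays unassigned when every
    thread that accesses it is already at capacity.
    """
    owned = defaultdict(int)  # number of blocks owned per thread
    owner = {}

    # Place the most heavily accessed blocks first.
    blocks = sorted(access_counts,
                    key=lambda b: -sum(access_counts[b].values()))
    for block in blocks:
        # Try threads in order of decreasing access count to this block;
        # ties are broken by thread id for determinism.
        ranked = sorted(access_counts[block].items(),
                        key=lambda kv: (-kv[1], kv[0]))
        for thread, _ in ranked:
            if owned[thread] < capacity:
                owner[block] = thread
                owned[thread] += 1
                break
    return owner
```

In this sketch a block left without an owner would fall back to whatever default placement the system uses. A "first touch" policy, by contrast, would assign each block to whichever thread happens to access it first, regardless of how often other threads use it later.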
compiler-assisted caching, partitioning, data distribution
Y. Li, R. Melhem, A. Abousamra and A. K. Jones, "Compiler-assisted data distribution for chip multiprocessors," 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT), Vienna, Austria, 2010, pp. 501-512.