Analyzing cache performance bottlenecks of STM applications and addressing them with compiler's help
2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT) (2010)
Sept. 11, 2010 to Sept. 15, 2010
DOI Bookmark: http://doi.ieeecomputersociety.org/
Sandya Mannarswamy , Indian Institute of Science & HP India
R. Govindarajan , Indian Institute of Science, Bangalore, India
Software transactional memory (STM) is a promising programming paradigm for shared memory multithreaded programs as an alternative to traditional lock based synchronization. However adoption of STM in mainstream software has been quite low due to its considerable overheads and its poor cache/memory performance. In this paper, we perform a detailed study of the cache behavior of STM applications and quantify the impact of different STM factors on the cache misses experienced by the applications. Based on our analysis, we propose a compiler driven Lock-Data Colocation (LDC), targeted at reducing the cache overheads on STM. We show that LDC is effective in improving the cache behavior of STM applications by reducing the dcache miss latency and improving execution time performance.
compiler, cache, software transactional memory
S. Mannarswamy and R. Govindarajan, "Analyzing cache performance bottlenecks of STM applications and addressing them with compiler's help," 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT), Vienna, Austria, 2010, pp. 547-548.