2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT) (2012)
Minneapolis, MN, USA
Sept. 19, 2012 to Sept. 23, 2012
DOI Bookmark: http://doi.ieeecomputersociety.org/
Yong Li, Department of ECE, University of Pittsburgh, PA, 15261, USA
Rami Melhem, Department of CS, University of Pittsburgh, PA, 15260, USA
Alex K. Jones, Department of ECE, University of Pittsburgh, PA, 15261, USA
State-of-the-art chip multiprocessor (CMP) proposals emphasize optimization to deliver computing power across many types of applications. This approach misses potentially significant performance improvements that leverage application-specific characteristics such as data access behavior. In this paper, we demonstrate that, using fairly simple and inexpensive static analysis, data can be classified as private or shared. In addition, we develop a novel compiler-based approach to speculatively detect a third classification: practically private. We demonstrate that practically private data is ubiquitous in parallel applications and that leveraging this classification provides opportunities to improve performance. While the proposed data classification scheme can be applied to many micro-architectural constructs, including the TLB, coherence directory, and interconnect, we demonstrate its potential through an efficient cache coherence design. Specifically, we show that the compiler-assisted mechanism reduces coherence traffic by an average of 46% and achieves up to 13%, 9%, and 5% performance improvement over shared, private, and state-of-the-art NUCA-based caching, respectively, depending on the scenario.
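The three-way classification named in the abstract can be illustrated with a small sketch. The abstract does not define "practically private", so the code assumes it means data that static analysis cannot prove private but that only one thread is observed to touch. The paper's mechanism is compiler-based and speculative; this sketch substitutes a simple check over a recorded access list purely for illustration. The names `DataClass`, `classify`, and `statically_shared` are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from enum import Enum


class DataClass(Enum):
    PRIVATE = "private"
    SHARED = "shared"
    PRACTICALLY_PRIVATE = "practically private"


def classify(statically_shared, accesses):
    """Classify each accessed variable (illustrative only, not the paper's algorithm).

    statically_shared: set of names a static pass could NOT prove private
                       (assumed input, hypothetical).
    accesses: iterable of (thread_id, variable_name) pairs.
    """
    # Collect the set of threads observed touching each variable.
    threads_by_var = defaultdict(set)
    for thread_id, var in accesses:
        threads_by_var[var].add(thread_id)

    result = {}
    for var, threads in threads_by_var.items():
        if var not in statically_shared:
            # Proven private by the (assumed) static pass.
            result[var] = DataClass.PRIVATE
        elif len(threads) == 1:
            # Assumption: looks shared statically, but only one thread uses it.
            result[var] = DataClass.PRACTICALLY_PRIVATE
        else:
            result[var] = DataClass.SHARED
    return result


# Example: two threads each work on their own chunk and share one counter.
example_accesses = [
    (0, "stack_tmp"),
    (0, "chunk0"),
    (1, "chunk1"),
    (0, "counter"),
    (1, "counter"),
]
example_classes = classify({"chunk0", "chunk1", "counter"}, example_accesses)
```

In this toy input, the per-thread chunks fall into the practically private class, which is the kind of data a coherence design could treat like private data while keeping a fallback for the case where the speculation turns out to be wrong.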
Coherence, Instruction sets, Benchmark testing, Runtime, Resource management, Optimization, Computer architecture
Y. Li, R. Melhem and A. K. Jones, "Practically Private: Enabling high performance CMPs through compiler-assisted data classification," 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT), Minneapolis, MN, USA, 2012, pp. 231-240.