The Community for Technology Leaders
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (2011)
Galveston, Texas USA
Oct. 10, 2011 to Oct. 14, 2011
ISSN: 1089-795X
ISBN: 978-0-7695-4566-0
pp: 155-166
ABSTRACT
For parallelism to become tractable for mass programmers, shared-memory languages and environments must evolve to enforce disciplined practices that ban "wild shared-memory behaviors;'' e.g., unstructured parallelism, arbitrary data races, and ubiquitous non-determinism. This software evolution is a rare opportunity for hardware designers to rethink hardware from the ground up to exploit opportunities exposed by such disciplined software models. Such a co-designed effort is more likely to achieve many-core scalability than a software-oblivious hardware evolution. This paper presents DeNovo, a hardware architecture motivated by these observations. We show how a disciplined parallel programming model greatly simplifies cache coherence and consistency, while enabling a more efficient communication and cache architecture. The DeNovo coherence protocol is simple because it eliminates transient states -- verification using model checking shows 15X fewer reachable states than a state-of-the-art implementation of the conventional MESI protocol. The DeNovo protocol is also more extensible. Adding two sophisticated optimizations, flexible communication granularity and direct cache-to-cache transfers, did not introduce additional protocol states (unlike MESI). Finally, DeNovo shows better cache hit rates and network traffic, translating to better performance and energy. Overall, a disciplined shared-memory programming model allows DeNovo to seamlessly integrate message passing-like interactions within a global address space for improved design complexity, performance, and efficiency.
INDEX TERMS
CITATION
Hyojin Sung, Robert Smolinski, Byn Choi, Rakesh Komuravelli, Nicholas P. Carter, Sarita V. Adve, Nima Honarmand, Vikram S. Adve, Ching-Tsun Chou, "DeNovo: Rethinking the Memory Hierarchy for Disciplined Parallelism", Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, vol. 00, no. , pp. 155-166, 2011, doi:10.1109/PACT.2011.21
177 ms
(Ver 3.3 (11022016))