Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (2009)
Raleigh, North Carolina, USA
Sept. 12, 2009 to Sept. 16, 2009
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/PACT.2009.16
This paper presents a task-centric memory model for 1000-core compute accelerators. Visual computing applications are emerging as an important class of workloads that can exploit 1000-core processors. In these workloads, we observe data sharing and communication patterns that can be leveraged in the design of memory systems for future 1000-core processors. Based on these insights, we propose a memory model that uses a software protocol, working in collaboration with hardware caches, to maintain a coherent, single-address space view of memory without the need for hardware coherence support. We evaluate the task-centric memory model in simulation on a 1024-core MIMD accelerator we are developing that, with the help of a runtime system, implements the proposed memory model. We evaluate coherence management policies related to the task-centric memory model and show that the overhead of maintaining a coherent view of memory in software can be minimal. We further show that, while software management may constrain speculative hardware prefetching into local caches, a common optimization, it does not constrain the more relevant use case of off-chip prefetching from DRAM into shared caches.
Cache Coherence, Memory Model, Computer Architecture, Compute Accelerator
John H. Kelm, Steven S. Lumetta, Sanjay J. Patel, Daniel R. Johnson, Matthew I. Frank, "A Task-Centric Memory Model for Scalable Accelerator Architectures", Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, vol. 00, no. , pp. 77-87, 2009, doi:10.1109/PACT.2009.16