The Community for Technology Leaders
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (2011)
Galveston, Texas USA
Oct. 10, 2011 to Oct. 14, 2011
ISSN: 1089-795X
ISBN: 978-0-7695-4566-0
pp: 179-180
ABSTRACT
In this paper we present OUTRIDERHP, a novel implementation of a decoupled architecture that approaches the performance of contemporary out-of-order processors on parallel benchmarks while maintaining low hardware complexity. OUTRIDERHP leverages the compiler to separate a single thread of execution into memory-accessing and memory-consuming streams that can be executed concurrently, which we call strands. We identify loss-of-decoupling events which cripple performance on traditional decoupled architectures, and design OUTRIDERHP to enable extraction of multiple strands and control speculation which provide superior memory and functional unit latency tolerance. OUTRIDERHP outperforms a baseline in-order architecture by 26-220% and Decoupled Access/Execute by 7-172% when executing parallel benchmarks on an 8-core CMP configuration. OUTRIDERHP performs within 15% of higher-complexity out-of-order cores despite not utilizing large physical register files, dynamic scheduling, and register renaming hardware.
INDEX TERMS
High-Performance, Decoupled, Processor, Computer Architecture
CITATION
Neal C. Crago, Sanjay J. Patel, "Decoupled Architectures as a Low-Complexity Alternative to Out-of-order Execution", Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, vol. 00, no. , pp. 179-180, 2011, doi:10.1109/PACT.2011.28
181 ms
(Ver 3.3 (11022016))