Proceedings Fifth International Symposium on High-Performance Computer Architecture (1999)
Jan. 9, 1999 to Jan. 12, 1999
Joan-Manuel Parcerisa, Universitat Politècnica de Catalunya - Barcelona
Antonio González, Universitat Politècnica de Catalunya - Barcelona
This work presents and evaluates a novel processor microarchitecture that combines two paradigms: access/execute decoupling and simultaneous multithreading. We investigate how the two techniques complement each other: while decoupling hides memory latency very efficiently, multithreading supplies the in-order issue stage with enough ILP to hide the functional unit latencies. Its partitioned layout, together with its in-order issue policy, makes it potentially less complex, in terms of critical path delays, than a centralized out-of-order design, and thus able to support future growth in issue width and clock speed. The simulations show that adding decoupling to a multithreaded architecture sharply increases its miss latency tolerance and, in addition, reduces the number of threads needed to achieve maximum throughput, especially for a large miss latency. Fewer threads result in reduced hardware complexity and lower demands on the memory system, which becomes a critical resource for large miss latencies, since bandwidth may become a bottleneck.
J. Parcerisa and A. González, "The Synergy of Multithreading and Access/Execute Decoupling," Proceedings Fifth International Symposium on High-Performance Computer Architecture (HPCA), Orlando, Florida, 1999, p. 59.