2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture (2010)
Dec. 4, 2010 to Dec. 8, 2010
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/MICRO.2010.41
Parallelism is the key to continued performance scaling in modern microprocessors. Yet we observe that parallel threads often contain a surprising amount of instruction redundancy. We propose to exploit this redundancy to improve performance and decrease energy consumption. Our multi-threading micro-architecture, Minimal Multi-Threading (MMT), leverages register renaming and the instruction window to combine the fetch and execution of identical instructions between threads in SPMD applications. While many techniques exploit intra-thread similarities by detecting when a later instruction may use an earlier result, MMT exploits inter-thread similarities by fetching instructions from different threads together whenever possible and splitting them only if the computation is unique. With two threads, our design achieves a speedup of 1.15 (geometric mean) over a traditional two-thread SMT processor with a trace cache. With four threads, our design achieves a speedup of 1.25 (geometric mean) over a traditional four-thread SMT processor with a trace cache. These correspond to speedups of 1.5 and 1.84, respectively, over a traditional out-of-order processor. Moreover, performance improves in most applications with no increase in power, because the added overhead is offset by a decrease in cache accesses, and energy consumption decreases for all applications.
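The two sets of speedups quoted in the abstract can be related by simple arithmetic. The sketch below is an illustration only, not a result from the paper: it assumes that both sets of figures are geometric means over the same benchmark set, in which case the ratio of the two gives the speedup of the SMT baseline over the out-of-order processor. The function name `implied_smt_over_ooo` is made up for this example.

```python
# Speedups reported in the abstract (geometric means), keyed by thread count.
mmt_over_smt = {2: 1.15, 4: 1.25}   # MMT vs. traditional SMT with a trace cache
mmt_over_ooo = {2: 1.50, 4: 1.84}   # MMT vs. traditional out-of-order processor


def implied_smt_over_ooo(threads):
    """Speedup of the SMT baseline over out-of-order implied by the two ratios.

    Assumption (not stated in the abstract): both figures are geometric means
    over the same benchmarks, so dividing them is meaningful.
    """
    return mmt_over_ooo[threads] / mmt_over_smt[threads]


for threads in (2, 4):
    ratio = implied_smt_over_ooo(threads)
    print(f"{threads} threads: SMT over out-of-order ~ {ratio:.2f}")
# prints:
# 2 threads: SMT over out-of-order ~ 1.30
# 4 threads: SMT over out-of-order ~ 1.47
```

Under that assumption, the SMT baseline already accounts for roughly 1.30x and 1.47x of the gain over the out-of-order processor, and MMT contributes the remaining 1.15x and 1.25x.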
multi-threading, parallel architectures, program processors, redundancy
G. Long et al., "Minimal Multi-threading: Finding and Removing Redundant Instructions in Multi-threaded Processors," 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Atlanta, Georgia, USA, 2010, pp. 337-348.