Issue No.07 - July (2008 vol.19)
pp: 914-925
This paper presents the Mitosis framework, a combined hardware-software approach to speculative multithreading that remains effective even in the presence of frequent dependences among threads. Speculative multithreading increases single-threaded application performance by exploiting thread-level parallelism speculatively; that is, it executes code in parallel even when the compiler or runtime system cannot guarantee that the parallelism exists. The proposed approach predicts or computes thread input values in software, through a piece of code added at the beginning of each thread (the pre-computation slice). A pre-computation slice is expected to compute the correct thread input values most of the time, but not necessarily always. Because occasional incorrect values are tolerated, aggressive optimization techniques can be applied to the slice to make it very short. This paper focuses on the microarchitecture that supports this execution model. The primary novelty of the microarchitecture is the hardware support for the execution and validation of pre-computation slices. Additionally, this paper presents new architectures for the register file and the cache memory that support multiple versions of each variable and allow efficient roll-back in case of misspeculation. We show that the proposed microarchitecture, together with the compiler support, achieves an average speedup of 2.2 for applications that conventional non-speculative approaches are not able to parallelize at all.
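The execution model described in the abstract (predict a thread's input values with a short slice, run the thread speculatively, validate, and roll back on a mismatch) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the Mitosis hardware or compiler: the loop body, the function names (`body`, `run_region`, `precomputation_slice`, `speculative_run`), and the choice of a single split point are all hypothetical, and the two "threads" run one after the other here, whereas the real design executes them in parallel.

```python
# Toy model of speculative multithreading with a pre-computation slice.
# All names are hypothetical; this models only the ordering of
# predict -> speculate -> validate -> commit/squash, not real parallelism.

def body(state, x):
    """One loop iteration: a running count that a rare negative input resets."""
    return 0 if x < 0 else state + 1


def run_region(state, items):
    """Execute the original loop body over a region of the input."""
    for x in items:
        state = body(state, x)
    return state


def precomputation_slice(state_at_spawn, items_before_split):
    """Aggressively pruned slice: it assumes the rare reset path is never
    taken, so the predicted live-in is just the count of skipped items.
    It is usually right, but not always."""
    return state_at_spawn + len(items_before_split)


def speculative_run(data, split):
    """Return (final_state, squashed) for a two-thread speculative execution."""
    first, second = data[:split], data[split:]

    # Spawn: the slice predicts the live-in value of the speculative thread.
    predicted = precomputation_slice(0, first)

    # Speculative thread: runs from the predicted value; its result is
    # buffered and not yet committed.
    speculative_result = run_region(predicted, second)

    # Non-speculative thread: computes the true value at the split point.
    actual = run_region(0, first)

    # Validation: commit if the prediction was right, otherwise squash the
    # speculative work and re-execute from the correct value.
    if actual == predicted:
        return speculative_result, False
    return run_region(actual, second), True


print(speculative_run([1, 2, 3, 4, 5, 6], 3))   # (6, False): slice was correct
print(speculative_run([1, -1, 3, 4, 5, 6], 3))  # (4, True): mispredicted, squashed
```

In both cases the final state matches what the sequential loop would produce; the pruned slice only affects whether the speculative work is kept or discarded, which is why it can afford to be wrong occasionally.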
Speculative thread-level parallelism, pre-computation slices, thread partitioning, multi-core architecture.
Carlos Madriles, Carlos García-Quiñones, Jesús Sánchez, Pedro Marcuello, Antonio González, Dean M. Tullsen, Hong Wang, John P. Shen, "Mitosis: A Speculative Multithreaded Processor Based on Precomputation Slices", IEEE Transactions on Parallel & Distributed Systems, vol.19, no. 7, pp. 914-925, July 2008, doi:10.1109/TPDS.2007.70797