Thread Partitioning and Value Prediction for Exploiting Speculative Thread-Level Parallelism
February 2004 (vol. 53 no. 2)
pp. 114-125

Abstract: Speculative thread-level parallelism has recently been proposed as a source of parallelism to improve performance in applications where parallel threads are hard to find. However, the efficiency of this execution model depends strongly on the performance of the control and data speculation techniques. In this work, several hardware-based schemes for partitioning the program into speculative threads are analyzed and evaluated. In general, we find that spawning threads associated with loop iterations is the most effective technique. We also show that value prediction is critical for the performance of all of the spawning policies. Thus, a new value predictor, the increment predictor, is proposed. This predictor is specifically designed for this kind of architecture and clearly outperforms adapted versions of conventional value predictors such as last-value, stride, and context-based predictors, especially for small history tables.
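For readers unfamiliar with the conventional predictors named in the abstract, the following is a minimal software sketch of the textbook last-value and stride schemes. It is not the paper's increment predictor, which the abstract does not describe, and it does not model the thread-level adaptations the authors evaluate. The class names, the unbounded dictionary standing in for a history table, and the `accuracy` helper are all assumptions made for illustration.

```python
class LastValuePredictor:
    """Predicts that a value will repeat its most recent outcome."""

    def __init__(self):
        # Hypothetical unbounded table: key (e.g. an instruction address)
        # -> last observed value. Real hardware tables are finite.
        self.table = {}

    def predict(self, key):
        # Returns None when there is no history for this key yet.
        return self.table.get(key)

    def update(self, key, value):
        self.table[key] = value


class StridePredictor:
    """Predicts the last value plus the most recently observed stride."""

    def __init__(self):
        # key -> (last observed value, last observed stride)
        self.table = {}

    def predict(self, key):
        if key not in self.table:
            return None
        last, stride = self.table[key]
        return last + stride

    def update(self, key, value):
        if key in self.table:
            last, _ = self.table[key]
            self.table[key] = (value, value - last)
        else:
            # First observation: no stride is known yet, so assume zero.
            self.table[key] = (value, 0)


def accuracy(predictor, key, values):
    """Fraction of values predicted correctly, updating after each one."""
    hits = 0
    for value in values:
        if predictor.predict(key) == value:
            hits += 1
        predictor.update(key, value)
    return hits / len(values)


if __name__ == "__main__":
    # A loop induction variable advancing by 4 each iteration.
    induction = [0, 4, 8, 12, 16, 20]
    print(accuracy(LastValuePredictor(), 0x400, induction))
    print(accuracy(StridePredictor(), 0x400, induction))
```

On the induction-variable sequence, the last-value scheme never predicts correctly because the value changes every iteration, while the stride scheme is correct from the third value onward, once a stride has been observed. This is one reason loop-carried values are a natural target for stride-like prediction.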

Index Terms:
Speculative thread-level parallelism, value prediction, branch prediction, thread spawning policies, clustered architectures.
Citation:
Pedro Marcuello, Antonio González, Jordi Tubella, "Thread Partitioning and Value Prediction for Exploiting Speculative Thread-Level Parallelism," IEEE Transactions on Computers, vol. 53, no. 2, pp. 114-125, Feb. 2004, doi:10.1109/TC.2004.1261823