Hybrid Load-Value Predictors
July 2002 (vol. 51 no. 7)
pp. 759-774

Load instructions diminish processor performance in two ways. First, due to the continuously widening gap between CPU and memory speed, the relative latency of load instructions grows constantly and already slows program execution. Second, memory reads limit the available instruction-level parallelism because instructions that use the result of a load must wait for the memory access to complete before they can start executing. Load-value predictors alleviate both problems by allowing the CPU to speculatively continue processing without having to wait for load instructions, which can significantly improve the execution speed. While several hybrid load-value predictors have been proposed and found to work well, no systematic study of such predictors exists. In this paper, we investigate the performance of all hybrids that can be built out of a register value, a last value, a stride 2-delta, a last four value, and a finite context method predictor. Our analysis shows that hybrids can deliver 25 percent more speedup than the best single-component predictors. An examination of the individual components of hybrids revealed that predictors with a poor standalone performance sometimes make excellent components in a hybrid, while combining well-performing individual predictors often does not result in an effective hybrid. Our hybridization study identified the register value + stride 2-delta predictor as one of the best two-component hybrids. It matches or exceeds the speedup of two-component hybrids from the literature in spite of its substantially smaller and simpler design. Of all the predictors we studied, the register value + stride 2-delta + last four value hybrid performs best. It yields a harmonic-mean speedup over the eight SPECint95 programs of 17.2 percent.
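As a rough illustration of two of the component predictors named above, the following Python sketch models a last value predictor and a stride 2-delta predictor, each indexed by the load's program counter. It is a simplified sketch under stated assumptions, not the implementation evaluated in the paper: the class names, the `accuracy` helper, the unbounded dictionary tables, and the example traces are all hypothetical, and hardware details such as finite table sizes, confidence estimation, and 64-bit wraparound are left out.

```python
class LastValuePredictor:
    """Predicts that a load returns the same value it returned last time."""

    def __init__(self):
        self.table = {}  # pc -> last loaded value (unbounded; an assumption)

    def predict(self, pc):
        return self.table.get(pc)  # None if this load was never seen

    def update(self, pc, value):
        self.table[pc] = value


class Stride2DeltaPredictor:
    """Predicts last value + stride, where the predicting stride is only
    replaced after the same new stride has been observed twice in a row."""

    def __init__(self):
        self.table = {}  # pc -> (last value, most recent stride, predicting stride)

    def predict(self, pc):
        entry = self.table.get(pc)
        if entry is None:
            return None
        last, _recent, stride = entry
        return last + stride

    def update(self, pc, value):
        entry = self.table.get(pc)
        if entry is None:
            self.table[pc] = (value, 0, 0)
            return
        last, recent, stride = entry
        new_stride = value - last
        if new_stride == recent:  # stride confirmed by two consecutive deltas
            stride = new_stride
        self.table[pc] = (value, new_stride, stride)


def accuracy(predictor, trace, pc=0x400):
    """Fraction of values in `trace` predicted correctly for one load at `pc`."""
    correct = 0
    for value in trace:
        if predictor.predict(pc) == value:
            correct += 1
        predictor.update(pc, value)
    return correct / len(trace)


if __name__ == "__main__":
    strided = [10, 20, 30, 40, 50, 60]  # e.g. walking an array
    constant = [7, 7, 7, 7]             # e.g. reloading a loop invariant
    for name, trace in (("strided", strided), ("constant", constant)):
        print(name,
              accuracy(LastValuePredictor(), trace),
              accuracy(Stride2DeltaPredictor(), trace))
```

On the strided trace the last value predictor never hits, while the stride 2-delta predictor is correct from the fourth load on, once the stride of 10 has been confirmed. On the constant trace both behave identically. This difference in which loads each component covers is what makes combining components into a hybrid potentially worthwhile.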

Index Terms:
Value prediction, value locality, load-value predictor, hybrid predictor, performance metrics.
Martin Burtscher, Benjamin G. Zorn, "Hybrid Load-Value Predictors," IEEE Transactions on Computers, vol. 51, no. 7, pp. 759-774, July 2002, doi:10.1109/TC.2002.1017696