Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (1999)
Newport Beach, California
Oct. 12, 1999 to Oct. 16, 1999
Benjamin G. Zorn , Microsoft Corporation
Martin Burtscher , University of Colorado
Most load value predictors retain a large number of previously loaded values for making future predictions. In this paper we evaluate the trade-off between tall and slim versus short and wide predictors of the same total size, i.e., between retaining a few values for a large number of load instructions and many values for a proportionately smaller number of loads. Our results show, for example, that even modest predictors holding sixteen kilobytes of values benefit from retaining four values per load instruction when running SPECint95.A detailed comparison of eight load value predictors on a cycle-accurate simulator of a superscalar out-of-order microprocessor shows that our implementation of a last four value predictor outperforms other predictors from the literature, often significantly. With 21kB of state, it yields a harmonic mean speedup of 12.5% with existing re-fetch misprediction recovery hardware and 13.7% with a not yet realized re-execution recovery mechanism.
value prediction, processor performance, value locality, predictor design, behavior prediction
Benjamin G. Zorn, Martin Burtscher, "Exploring Last n Value Prediction", Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, vol. 00, no. , pp. 66, 1999, doi:10.1109/PACT.1999.807407