ARB: A Hardware Mechanism for Dynamic Reordering of Memory References
May 1996 (vol. 45 no. 5)
pp. 552-571

Abstract—To exploit instruction level parallelism, it is important not only to execute multiple memory references per cycle, but also to reorder memory references, especially to execute loads before stores that precede them in the sequential instruction stream. To guarantee correctness of execution in such situations, memory reference addresses have to be disambiguated. This paper presents a novel hardware mechanism, called an Address Resolution Buffer (ARB), for performing dynamic reordering of memory references. The ARB supports the following features: 1) dynamic memory disambiguation in a decentralized manner, 2) multiple memory references per cycle, 3) out-of-order execution of memory references, 4) unresolved loads and stores, 5) speculative loads and stores, and 6) memory renaming. The paper presents the results of a simulation study that we conducted to verify the efficacy of the ARB for a superscalar processor. The paper also shows the ARB's application in a multiscalar processor.
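
The reordering problem named in the abstract can be illustrated with a small software model. The sketch below is not the ARB hardware design from the paper; it is a minimal illustration, under assumed names (`DisambiguationBuffer`, `AddressEntry`, `seq`), of how a buffer keyed by address can let a load run ahead of an earlier store, forward the value of the nearest preceding buffered store, and report which already-executed loads must be redone when an earlier store to the same address arrives late.

```python
from dataclasses import dataclass, field


@dataclass
class AddressEntry:
    """Hypothetical per-address record of in-flight references."""
    loads: set = field(default_factory=set)     # sequence numbers of executed loads
    stores: dict = field(default_factory=dict)  # sequence number -> buffered value


class DisambiguationBuffer:
    """Toy model: `seq` is the position in the sequential instruction stream."""

    def __init__(self, memory):
        self.memory = memory   # committed memory state: address -> value
        self.entries = {}      # address -> AddressEntry

    def load(self, seq, addr):
        """Execute a load, possibly before earlier stores have executed."""
        entry = self.entries.setdefault(addr, AddressEntry())
        entry.loads.add(seq)
        earlier = [s for s in entry.stores if s < seq]
        if earlier:
            # Forward from the nearest preceding buffered store.
            return entry.stores[max(earlier)]
        return self.memory.get(addr, 0)

    def store(self, seq, addr, value):
        """Buffer a store; return later loads that read a stale value."""
        entry = self.entries.setdefault(addr, AddressEntry())
        entry.stores[seq] = value
        violated = []
        for load_seq in sorted(entry.loads):
            if load_seq <= seq:
                continue
            # A store between this one and the load already supplied the
            # load's value, so the load is unaffected.
            shadowed = any(seq < s < load_seq for s in entry.stores)
            if not shadowed:
                violated.append(load_seq)
        return violated


buf = DisambiguationBuffer({0x100: 7})
print(buf.load(seq=2, addr=0x100))            # 7: load ran ahead of store 1
print(buf.store(seq=1, addr=0x100, value=9))  # [2]: load 2 must re-execute
print(buf.load(seq=2, addr=0x100))            # 9: re-executed load sees store 1
print(buf.store(seq=0, addr=0x100, value=5))  # []: store 1 shadows store 0
```

In this model the list returned by `store` stands in for the recovery action a real processor would take on a detected dependence violation; committing stores to memory in order and bounding the buffer size are left out for brevity.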

Index Terms:
Address Resolution Buffer (ARB), dynamic scheduling, memory address disambiguation, speculative execution, unresolved memory references.
Manoj Franklin, Gurindar S. Sohi, "ARB: A Hardware Mechanism for Dynamic Reordering of Memory References," IEEE Transactions on Computers, vol. 45, no. 5, pp. 552-571, May 1996, doi:10.1109/12.509907