The Effect of Code Expanding Optimizations on Instruction Cache Design
September 1993 (vol. 42 no. 9)
pp. 1045-1057

The authors show that code expanding optimizations have strong and nonintuitive implications for instruction cache design. Three types of code expanding optimizations are studied: instruction placement, function inline expansion, and superscalar optimizations. Instruction placement reduces the miss ratio of small caches. Function inline expansion improves performance for small caches but degrades it for medium-sized caches. Superscalar optimizations increase the miss ratio for all cache sizes; however, they also increase the sequentiality of instruction access, so a simple load forwarding scheme effectively cancels the negative effects. Overall, with load forwarding, the three types of code expanding optimizations jointly improve the performance of small caches and have little effect on large caches.
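
To make the miss-ratio metric concrete, the following is a minimal sketch of how code growth can interact with cache size. It is a toy model and not the authors' simulation methodology: the function name `miss_ratio`, the direct-mapped organization, the 4-byte instruction size, and the synthetic loop traces are all assumptions made for illustration.

```python
def miss_ratio(trace, cache_bytes, block_bytes):
    """Miss ratio of a direct-mapped cache over a trace of byte addresses.

    Hypothetical helper for illustration; not taken from the paper.
    """
    num_blocks = cache_bytes // block_bytes
    tags = [None] * num_blocks          # one stored block number per cache line
    misses = 0
    for addr in trace:
        block = addr // block_bytes     # memory block holding this instruction
        index = block % num_blocks      # direct-mapped: block picks its line
        if tags[index] != block:        # cold or conflict miss
            tags[index] = block
            misses += 1
    return misses / len(trace)


def loop_trace(num_instructions, iterations, instr_bytes=4):
    """Synthetic trace: a straight-line loop body fetched `iterations` times."""
    body = [i * instr_bytes for i in range(num_instructions)]
    return body * iterations


# Assumed toy sizes: a 256-byte cache with 16-byte blocks.
compact = loop_trace(num_instructions=64, iterations=10)    # 256-byte loop body
expanded = loop_trace(num_instructions=128, iterations=10)  # 512-byte loop body

print(miss_ratio(compact, cache_bytes=256, block_bytes=16))
print(miss_ratio(expanded, cache_bytes=256, block_bytes=16))
```

In this toy setup the compact loop fits in the cache and misses only on its first pass, while the expanded loop is twice the cache size and misses on every block in every iteration. Real programs behave less uniformly, which is why the effects reported in the abstract depend on cache size and on the optimization applied.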

Index Terms:
code expanding optimizations; instruction cache; cache design; instruction placement; function inline expansion; superscalar optimizations; miss ratio; small caches; medium caches; load forwarding; large caches; C compiler; code optimization; cache memory; code expansion; buffer storage; memory architecture; optimisation.
W.Y. Chen, P.P. Chang, T.M. Conte, W.W. Hwu, "The Effect of Code Expanding Optimizations on Instruction Cache Design," IEEE Transactions on Computers, vol. 42, no. 9, pp. 1045-1057, Sept. 1993, doi:10.1109/12.241594