Performance Evaluation of a Decoded Instruction Cache for Variable Instruction Length Computers
October 1994 (vol. 43 no. 10)
pp. 1140-1150

A Decoded INstruction Cache (DINC) is a buffer between the instruction decoder and the other instruction pipeline stages. In this paper, we explain how techniques that reduce the branch penalty on a DINC can improve CPU performance. We also analyze the impact of some of the design parameters of DINCs on variable instruction length computers. Our study indicates that tuning the mapping of instructions into the cache can improve performance substantially, and that this tuning must be based on the instruction length distribution of the specific architecture. In addition, the degree of associativity has a greater effect on the performance of a DINC than on the performance of regular caches. We discuss how the performance of DINCs differs from that of other caches when longer cache lines are used. We present a model that estimates the miss rate from the DINC characteristics discussed and analyzed throughout this paper. Our conclusions are based on both analytical study and trace-driven simulations of several integer UNIX applications.


Index Terms:
performance evaluation; computer architecture; buffer storage; decoded instruction cache; variable instruction length computers; instruction decoder; instruction pipeline stages; instruction length distribution; trace driven simulations; UNIX applications.
Citation:
G.D. Intrater, I.Y. Spillinger, "Performance Evaluation of a Decoded Instruction Cache for Variable Instruction Length Computers," IEEE Transactions on Computers, vol. 43, no. 10, pp. 1140-1150, Oct. 1994, doi:10.1109/12.324540