The Nonuniform Distribution of Instruction-Level and Machine Parallelism and its Effect on Performance
December 1989 (vol. 38 no. 12)
pp. 1645-1658
A methodology for quickly estimating machine performance is developed. A first-order estimate is based on the average degree of machine parallelism. A second-order model corrects for the effects of nonuniformities in instruction-level and machine parallelism and is shown to be accurate to within 15% for three widely different machine pipelines: the CRAY-1, the MultiTitan, and a dual-issue super

Index Terms:
machine performance; first-order estimate; machine parallelism; instruction-level; machine pipelines; CRAY-1; MultiTitan; superscalar machine; parallel architectures; performance evaluation; pipeline processing.
N.P. Jouppi, "The Nonuniform Distribution of Instruction-Level and Machine Parallelism and its Effect on Performance," IEEE Transactions on Computers, vol. 38, no. 12, pp. 1645-1658, Dec. 1989, doi:10.1109/12.40844