loading...
 This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
An Optimal Instruction Scheduler for Superscalar Processor
March 1995 (vol. 6 no. 3)
pp. 303-313

Abstract—Performance in superscalar processing strongly depends on the compiler's ability to generate codes that can be executed by hardware in an optimal or near optimal order. Generating optimal code is an NP-complete problem. However, there is a need for highly optimized code, such as in superscalar or real-time systems. In this paper, an instruction scheduling scheme for optimizing a program trace is proposed. Optimized code can be arrived at without much redundant work, if some important features in code are well explored and utilized in scheduling. To formalize the task, two abstract models, one for a superscalar processor and the other for a program trace, are given. These two models reflect most of the characteristics of the scheduling problem. The interrelations between instructions and partial schedules are thoroughly studied, and dominance and equivalence relations on them are defined. These relations are then used to reduce the solution space and eventually help to produce optimal schedules. The results of experiments that show the promise of the proposed scheme are also presented.

Index Terms—Pipeline processors, sequencing and scheduling, optimization, prune and search, and NP-completeness.

[1] R. R. Oehler and R. D. Groves,“IBM RISC System/6000 processor architecture,”IBM J. Res. Develop., vol. 34, pp. 23–36, 1990. [2] N. Margulis,i860 Microprocessor Architecture. New York: McGraw-Hill, 1990. [3] V. J. Rayward-Smith,“UET scheduling with unit interprocessor communication delays,”Discrete Appl. Math., vol. 18, pp. 55–71, 1987. [4] M. Johnson,Superscalar Microprocessor Design. Englewood Cliffs, NJ: Prentice-Hall, 1990.[5] H. S. Warren, Jr.,“Instruction scheduling for the IBM RISC System/6000 processor,”IBM J. Res. Develop., vol. 34, pp. 85–92, 1990. [6] R. M. Tomasulo,“An efficient algorithm for exploiting multiple arithmetic units,”IBM J. Res. Develop., vol. 11, pp. 25–33, 1967.[7] V. Propescu, M. Schultz, J. Spracklen, G. Gibson, B. Lightner, and D. Isaman,“The megaflow architecture,”IEEE Micro, June 1991. [8] J.R. Ellis, Bulldog: A Compiler for VLIW Architectures.Cambridge, Mass.: MIT Press, 1986.[9] J. Bruno, J. W. Jones, and K. So,“Deterministic scheduling with pipelined processors,”IEEE Trans. Comput., vol. C-29, pp. 308–316, 1980.[10] J. L. Hennessy and T. R. Gross,“Postpass code optimization of pipeline constraints,”ACM Trans. Programming Language and System, vol. 5, pp. 442–448, 1983. [11] J. A. Fisher,“The VLIW machine: A multiprocessor for compiling scientific code,”IEEE Comput., pp. 45–53, July 1984.[12] J. A. Fisher,“Trace scheduling: A technique for global microcode compaction,”IEEE Trans. Comput., vol. C-30, pp. 478–490, 1981.[13] S. Melvin,“Exploiting fine-grained parallelism through a combination of hardware and software techniques,”inProc. Int. Conf. Parallel Comput., 1991.[14] M.R. Garey and D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness.New York: W.H. Freeman, 1979.[15] C. V. Ramamoorthy, K. M. Chandy, and M. J. Gonzalez Jr,“Optimal scheduling strategies in a multiprocessor system,”IEEE Trans. Comput., vol. C-21, pp. 137–146, 1972.[16] C. I. Yang, J. S. Wang, and R. C. T. Lee,“A branch-and-bound algorithm to solve the equal-execution-time job scheduling problem with precedence constraint and profile,”Comput. Oper. Res., vol. 16, pp. 257–269, 1989. [17] H. C. Chou,“A study of superscalar instruction scheduling problem,”Ph.D. dissertation, Inst. Comput. Sci. Inform. Eng., Nat. Chiao Tung Univ., Taiwan, 1992.[18] E. L. Lawler,Combinatorial Optimization: Networks and Matroids. New York: Holt, Rhinehart, and Winston, 1976.[19] M. R. Edward, N. N. Jurg, and D. Narsingh,Combinatiorial Algorithms: Theory and Practice. Englewood Cliffs, NJ: Prentice-Hall, 1977.[20] Y. H. Shiau and C. P. Chung,“Effects and handling of instruction class contention in superscalar processing,”submitted paper.[21] E. Lawer, J. K. Lenstra, C. Martel, B. Simons, and L. Stockmeyer,“Pipeline scheduling: A survey,”IBM Res. Rep. RJ 5738, San Jose, CA, 1987.

Citation:
Hong-Chich Chou, Chung-Ping Chung, "An Optimal Instruction Scheduler for Superscalar Processor," IEEE Transactions on Parallel and Distributed Systems, vol. 6, no. 3, pp. 303-313, Mar. 1995, doi:10.1109/71.372778
Usage of this product signifies your acceptance of the Terms of Use.