The Importance of Prepass Code Scheduling for Superscalar and Superpipelined Processors
March 1995 (vol. 44 no. 3)
pp. 353-370

Abstract—Superscalar and superpipelined processors utilize parallelism to achieve peak performance that can be several times higher than that of conventional scalar processors. In order for this potential to be translated into the speedup of real programs, the compiler must be able to schedule instructions so that the parallel hardware is effectively utilized. Previous work has shown that prepass code scheduling helps to produce a better schedule for scientific programs, but the importance of prescheduling has never been demonstrated for control-intensive non-numeric programs. These programs are significantly different from the scientific programs because they contain frequent branches. The compiler must do global scheduling in order to find enough independent instructions.

In this paper, the code optimizer and scheduler of the IMPACT-I C compiler are described. Within this framework, we study the importance of prepass code scheduling for a set of production C programs. It is shown that, in contrast to the results previously obtained for scientific programs, prescheduling is not important when compiling control-intensive programs for the current generation of superscalar and superpipelined processors. However, if some of the current restrictions on upward code motion can be removed in future architectures, prescheduling would substantially improve the execution time of this class of programs on both superscalar and superpipelined processors.

Index Terms:
Code scheduling, control-intensive programs, optimizing compiler, register allocation, superpipelined processors, superscalar processors.
Daniel M. Lavery, Pohua P. Chang, Scott A. Mahlke, William Y. Chen, Wen-mei W. Hwu, "The Importance of Prepass Code Scheduling for Superscalar and Superpipelined Processors," IEEE Transactions on Computers, vol. 44, no. 3, pp. 353-370, March 1995, doi:10.1109/12.372029