Scheduling Superblocks with Bound-Based Branch Trade-Offs
August 2001 (vol. 50 no. 8)
pp. 784-797

Abstract—Since instruction level parallelism in basic blocks is often limited, compilers increase performance by creating superblocks that allow operations to be issued speculatively. This is difficult in general because each branch competes for the processor's limited resources. Previous work manages the performance trade-offs that exist between branches only indirectly. We show here that dependence and resource constraints can be used to gather explicit knowledge about scheduling trade-offs between branches. This paper's first contribution is a set of new, tighter lower bounds on the execution times of superblocks that specifically account for the dependence and resource conflicts between pairs of branches. This paper's second contribution is a novel superblock scheduling heuristic that finds high performance schedules by determining the operations that each branch needs to be scheduled early and selecting branches with compatible needs that favor beneficial branch trade-offs. Performance evaluations for superblocks from SPECint95 indicate that our bounds are very tight and that our scheduling heuristic outperforms well-known superblock scheduling algorithms.
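To make the idea of a lower bound on superblock execution time concrete, the following is a minimal sketch of a naive per-branch bound, not the paper's pairwise bounds. It assumes unit-latency operations, a single resource class with `width` issue slots per cycle, and cycles numbered from 1. The function names, the `preds` dependence map, and the example superblock are all hypothetical. Each branch's earliest issue cycle is bounded by the larger of its dependence-chain length and the number of cycles needed to issue all operations it depends on; the bounds are then weighted by exit probability.

```python
from math import ceil

def ancestors(preds, op):
    """Return op plus every operation it transitively depends on."""
    seen, stack = {op}, [op]
    while stack:
        for p in preds.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def depth(preds, op, memo=None):
    """Length, in operations, of the longest dependence chain ending at op."""
    memo = {} if memo is None else memo
    if op not in memo:
        memo[op] = 1 + max((depth(preds, p, memo) for p in preds.get(op, ())),
                           default=0)
    return memo[op]

def naive_weighted_bound(preds, exit_prob, width):
    """Per-branch lower bounds on issue cycle, and their weighted sum.

    Assumptions (not from the paper): unit latencies, one resource class,
    `width` operations issued per cycle, cycles counted from 1.
    """
    bounds = {}
    for branch in exit_prob:
        dep_bound = depth(preds, branch)                           # dependence constraint
        res_bound = ceil(len(ancestors(preds, branch)) / width)    # resource constraint
        bounds[branch] = max(dep_bound, res_bound)
    weighted = sum(exit_prob[b] * bounds[b] for b in bounds)
    return bounds, weighted

# Hypothetical superblock: branch b1 needs a; branch b2 needs b1, c, d, e.
preds = {"b1": ["a"], "b2": ["b1", "c", "d", "e"]}
exit_prob = {"b1": 0.3, "b2": 0.7}
bounds, weighted = naive_weighted_bound(preds, exit_prob, width=2)
```

Because this sketch bounds each branch in isolation, it ignores the fact that scheduling one branch early can delay another. That competition between branches is what the abstract says the paper's tighter bounds account for.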

Index Terms:
Superblock, scheduling heuristic, lower bound, ILP compiler technique.
W.M. Meleis, A.E. Eichenberger, I.D. Baev, "Scheduling Superblocks with Bound-Based Branch Trade-Offs," IEEE Transactions on Computers, vol. 50, no. 8, pp. 784-797, Aug. 2001, doi:10.1109/TC.2001.947007