Resource-Constrained Software Pipelining
December 1995 (vol. 6 no. 12)
pp. 1248-1270

Abstract—This paper presents a software pipelining algorithm for the automatic extraction of fine-grain parallelism in general loops. The algorithm accounts for machine resource constraints in a way that smoothly integrates the management of resource constraints with software pipelining. Furthermore, generality in the software pipelining algorithm is not sacrificed to handle resource constraints, and scheduling choices are made with truly global information. Proofs of correctness and the results of experiments with an implementation are also presented.
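As a rough illustration of why resource constraints matter for software pipelining, the sketch below computes the standard resource-imposed lower bound on the number of cycles per loop iteration in steady state. This is not the algorithm presented in the paper; it is a well-known bound from the modulo scheduling literature, and the operation mix, the machine description, and the name `resource_bound` are all hypothetical values chosen for the example.

```python
from math import ceil

def resource_bound(op_counts, units):
    """Lower bound on cycles per iteration imposed by resources alone.

    op_counts: operations of each resource class in one loop iteration.
    units: functional units of each class available per cycle.
    Both dictionaries are illustrative assumptions, not data from the paper.
    """
    return max(ceil(op_counts[r] / units[r]) for r in op_counts)

# Hypothetical loop body: 4 ALU ops, 2 memory ops, 1 branch per iteration.
ops = {"alu": 4, "mem": 2, "branch": 1}
# Hypothetical machine: 2 ALUs, 1 memory port, 1 branch unit.
machine = {"alu": 2, "mem": 1, "branch": 1}

print(resource_bound(ops, machine))
```

No pipelined schedule for this loop on this machine can start a new iteration more often than once every `resource_bound(ops, machine)` cycles, whatever the dependence structure allows; a resource-constrained pipelining algorithm has to find a schedule that respects that limit.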

Index Terms:
Software pipelining, instruction scheduling, program optimization, fine-grain parallelism, global scheduling.
Citation:
Alexander Aiken, Alexandru Nicolau, Steven Novack, "Resource-Constrained Software Pipelining," IEEE Transactions on Parallel and Distributed Systems, vol. 6, no. 12, pp. 1248-1270, Dec. 1995, doi:10.1109/71.476167