Vector Register Allocation
October 1992 (vol. 41 no. 10)
pp. 1290-1317

The problem of allocating vector registers on supercomputers is addressed in the context of compiling vector languages. Two subproblems must be solved to achieve good vector register allocation. First, the vector operations in the source program must be subdivided into sections that fit the hardware of the target machine. Second, the locality of reference of the vector operations must be improved via aggressive program transformations. Solutions to both of these problems, based on the use of novel aspects of data dependence, are presented. The techniques described extend naturally to scalar machines by observing that a scalar register is simply a vector register of length one.
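The first subproblem, subdividing a vector operation into sections that fit the hardware, can be illustrated with a minimal sketch. This is not the paper's algorithm; it only shows the general idea of sectioning. The names `VECTOR_LENGTH`, `sections`, and `vector_add` are hypothetical, and the register length of 64 elements is an assumption chosen to match the CRAY-1.

```python
from typing import Iterator, List, Tuple

# Assumed hardware vector register length (illustrative; 64 matches the CRAY-1).
VECTOR_LENGTH = 64


def sections(n: int, vl: int = VECTOR_LENGTH) -> Iterator[Tuple[int, int]]:
    """Yield half-open index ranges [lo, hi) of at most vl elements covering 0..n."""
    if vl <= 0:
        raise ValueError("vector length must be positive")
    for lo in range(0, n, vl):
        yield lo, min(lo + vl, n)


def vector_add(a: List[float], b: List[float], vl: int = VECTOR_LENGTH) -> List[float]:
    """Compute c = a + b one section at a time.

    Each loop iteration models one register-sized vector operation:
    two vector loads, one vector add, and one vector store.
    """
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    c = [0.0] * len(a)
    for lo, hi in sections(len(a), vl):
        va = a[lo:hi]  # models a vector load into one register
        vb = b[lo:hi]  # models a vector load into a second register
        c[lo:hi] = [x + y for x, y in zip(va, vb)]  # vector add, then store
    return c
```

Calling `sections(130)` would split a 130-element operation into two full sections and one short remainder. Setting `vl=1` degenerates to an element-by-element loop, which mirrors the abstract's remark that a scalar register is a vector register of length one.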

Index Terms:
vector register allocation; supercomputers; compiling vector languages; aggressive program transformations; data dependence; parallel processing; program compilers.
R. Allen, K. Kennedy, "Vector Register Allocation," IEEE Transactions on Computers, vol. 41, no. 10, pp. 1290-1317, Oct. 1992, doi:10.1109/12.166606