
ASCII Text  
R. Allen, K. Kennedy, "Vector Register Allocation," IEEE Transactions on Computers, vol. 41, no. 10, pp. 1290-1317, October 1992.  
BibTeX  
@article{ 10.1109/12.166606,
  author = {R. Allen and K. Kennedy},
  title = {Vector Register Allocation},
  journal = {IEEE Transactions on Computers},
  volume = {41},
  number = {10},
  issn = {0018-9340},
  year = {1992},
  pages = {1290--1317},
  doi = {http://doi.ieeecomputersociety.org/10.1109/12.166606},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}  
RefWorks Procite/RefMan/Endnote  
TY  - JOUR
JO  - IEEE Transactions on Computers
TI  - Vector Register Allocation
IS  - 10
SN  - 0018-9340
SP  - 1290
EP  - 1317
EPD  - 1290-1317
A1  - R. Allen
A1  - K. Kennedy
PY  - 1992
KW  - vector register allocation; supercomputers; compiling vector languages; aggressive program transformations; data dependence; parallel processing; program compilers.
VL  - 41
JA  - IEEE Transactions on Computers
ER  -
The problem of allocating vector registers on supercomputers is addressed in the context of compiling vector languages. Two subproblems must be solved to achieve good vector register allocation. First, the vector operations in the source program must be subdivided into sections that fit the hardware of the target machine. Second, the locality of reference of the vector operations must be improved via aggressive program transformations. Solutions to both of these problems, based on the use of novel aspects of data dependence, are presented. The techniques described extend naturally to scalar machines by observing that a scalar register is simply a vector register of length one.
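
The first subproblem, subdividing a vector operation into sections that fit the hardware, can be illustrated with a minimal sketch. This is not the paper's algorithm: the register length of 64 and the names `VECTOR_LENGTH`, `sections`, and `vector_add` are assumptions chosen for illustration, and the dependence-based transformations of the second subproblem are not shown.

```python
from typing import Iterator, List, Tuple

# Assumed vector register length, for illustration only.
VECTOR_LENGTH = 64


def sections(n: int, vlen: int = VECTOR_LENGTH) -> Iterator[Tuple[int, int]]:
    """Yield (start, length) pairs covering indices 0..n-1 in pieces of at most vlen."""
    if vlen < 1:
        raise ValueError("vlen must be positive")
    for start in range(0, n, vlen):
        yield start, min(vlen, n - start)


def vector_add(a: List[float], b: List[float], vlen: int = VECTOR_LENGTH) -> List[float]:
    """Compute a + b one section at a time, as a sectioned vector operation would."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    out = [0.0] * len(a)
    for start, length in sections(len(a), vlen):
        stop = start + length
        # Each slice stands in for one vector register load, add, and store.
        out[start:stop] = [x + y for x, y in zip(a[start:stop], b[start:stop])]
    return out
```

Calling `sections(150, 64)` yields two full sections followed by a remainder of 22 elements. Setting `vlen=1` degenerates to an ordinary element-by-element loop, which mirrors the abstract's remark that a scalar register is a vector register of length one.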