T. Nakatani and K. Ebcioglu, "Making Compaction-Based Parallelization Affordable," IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 9, pp. 1014-1029, September 1993.
DOI: http://doi.ieeecomputersociety.org/10.1109/71.243528
ISSN: 1045-9219
Publisher: IEEE Computer Society, Los Alamitos, CA, USA
Index Terms: compaction-based parallelization; code explosion problem; software pipelining; loop parallelization; software lookahead heuristic; VLIW parallelizing compiler; instruction-level parallelism; branch-intensive code; AIX utilities; sort; fgrep; sed; yacc; compress; instruction sets; parallel architectures; parallel programming; pipeline processing; program
Compaction-based parallelization suffers from long compile time and large code size because of its inherent code explosion problem. If software pipelining is performed for loop parallelization along with compaction, as in the authors' compiler, the code explosion problem becomes more serious. The authors propose the software lookahead heuristic for use in software pipelining, which allows inter-basic-block movement of code within a prespecified number of operations, called the software lookahead window, on any path emanating from the currently processed instruction at compile time. Software lookahead enables instruction-level parallelism to be exploited in a much greater code area than a single basic block, but the lookahead region is still limited to a constant depth by means of a user-specifiable window, and thus code explosion is restricted. The proposed scheme has been implemented in the authors' VLIW parallelizing compiler. To study the code explosion problem and instruction-level parallelism for branch-intensive code, they compiled five AIX utilities: sort, fgrep, sed, yacc, and compress. It is demonstrated that the software lookahead heuristic effectively alleviates the code explosion problem while successfully extracting a substantial amount of inter-basic-block parallelism.
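The windowing idea in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it only computes which operations fall inside a fixed-size lookahead window on some path from the current instruction, and it does no scheduling, compaction, or software pipelining. The control-flow graph encoding, the function name `lookahead_region`, and the sample blocks are all assumptions made for illustration.

```python
from collections import deque


def lookahead_region(cfg, ops, start_block, start_index, window):
    """Return the (block, index) positions of operations that lie within
    `window` operations of the start position on at least one path.

    cfg maps a block name to its successor block names; ops maps a block
    name to its list of operations. Both encodings are hypothetical.
    """
    region = set()
    best_budget = {}  # largest remaining budget seen when entering a block
    # Work items: (block, first index to scan, operations still allowed).
    work = deque([(start_block, start_index + 1, window)])
    while work:
        block, index, budget = work.popleft()
        block_ops = ops[block]
        # Consume one unit of budget per operation inside this block.
        while index < len(block_ops) and budget > 0:
            region.add((block, index))
            index += 1
            budget -= 1
        # Cross into successors only if the window is not yet exhausted.
        if budget > 0:
            for succ in cfg.get(block, ()):
                # Re-enter a block only with a strictly larger budget,
                # which also guarantees termination on loops.
                if best_budget.get(succ, 0) < budget:
                    best_budget[succ] = budget
                    work.append((succ, 0, budget))
    return region


# A diamond-shaped control-flow graph: A branches to B and C, both join at D.
cfg = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
ops = {
    "A": ["a0", "a1"],
    "B": ["b0", "b1", "b2"],
    "C": ["c0"],
    "D": ["d0", "d1"],
}

region = lookahead_region(cfg, ops, "A", 0, window=3)
print(sorted(region))
# [('A', 1), ('B', 0), ('B', 1), ('C', 0), ('D', 0)]
```

With a window of 3, the region reaches past the branch into B, C, and the first operation of D (through the short path via C), yet it stays bounded no matter how large the program is. That bound is the property the abstract credits with restricting code explosion.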

