Compiler and Run-Time Support for Exploiting Regularity within Irregular Applications
February 2000 (vol. 11 no. 2)
pp. 119-135

Abstract—This paper starts from the well-known idea that structure in irregular problems improves sequential performance, and shows that the same structure can also be exploited to parallelize irregular problems on a distributed-memory multicomputer. In particular, we extend a well-known parallelization technique called run-time compilation to use structure information that is explicit in the array subscripts. This paper presents a number of internal representations suited to particular access patterns and shows how various preprocessing structures, such as translation tables, trace arrays, and interprocessor communication schedules, can be encoded in terms of one or more of these representations. We show how loop and index normalization are important for detecting irregularity in array references, as well as locality in such references. This paper presents methods for detecting irregularity, determining the feasibility of inspection, and, finally, placing inspectors and interprocessor communication schedules. We show that this process can be automated through extensions to an HPF/Fortran-77 distributed-memory compiler (PARADIGM) and a new run-time support library for irregular problems (PILAR) that uses a variety of internal representations of communication patterns. We devise performance measures that consider the relationship between the inspection cost, the execution cost, and the number of times the executor is invoked, so that the competing schemes can be compared independently of the number of iterations. Finally, we show experimental results on an IBM SP-2 that validate our approach. These results show that dramatic improvements in both memory requirements and execution time can be achieved by using these techniques.
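The inspector/executor scheme that run-time compilation builds on can be sketched as follows. This is an illustrative Python sketch of the general idea, not PILAR's or PARADIGM's actual interface; all function names and the flat local/remote schedule representation are hypothetical simplifications.

```python
# Sketch of the inspector/executor model for an irregular loop such as
#   do i = 1, n : y(i) = x(idx(i))
# where x is block-distributed and idx is an indirection array.

def inspect(idx, my_lo, my_hi):
    """Inspector: classify each access as local or off-processor and
    record it in a schedule. Run once, and reused across executor
    invocations as long as idx does not change."""
    local, remote = [], []
    for i, j in enumerate(idx):
        (local if my_lo <= j < my_hi else remote).append((i, j))
    return local, remote  # the communication 'schedule'

def execute(x_local, my_lo, schedule, fetch_remote, y):
    """Executor: gather off-processor data per the schedule, then run
    the loop body. Invoked on every iteration at low per-call cost."""
    local, remote = schedule
    for i, j in local:
        y[i] = x_local[j - my_lo]        # purely local access
    for i, j in remote:
        y[i] = fetch_remote(j)           # scheduled off-processor fetch
    return y
```

The one-time inspection cost is amortized over executor invocations: roughly, inspection pays off once (number of executor calls) x (per-call savings) exceeds the inspection cost, which is the kind of relationship the paper's performance measures are designed to capture independently of the iteration count.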

[1] G. Agrawal, A. Sussman, and J. Saltz, "Compiler and Runtime Support for Structured and Block Structured Applications," Proc. Supercomputing '93, pp. 578-587, Los Alamitos, Calif.: IEEE CS Press, Nov. 1993.
[2] I. Al-Furaih and S. Ranka, "Memory Hierarchy Management for Iterative Graph Structures," Proc. Int'l Parallel Processing Symp., 1998.
[3] P. Banerjee et al., "The Paradigm Compiler for Distributed-Memory Multicomputers," Computer, Vol. 28, No. 10, Oct. 1995, pp. 37-47.
[4] U. Banerjee, Loop Transformations for Restructuring Compilers, Kluwer Academic Publishers, Boston, Mass., 1997.
[5] S.T. Barnard and H. Simon, "A Fast Multilevel Implementation of Recursive Spectral Bisection for Partitioning Unstructured Problems," Technical Report RNR-92-033, NASA Ames Research Center, Nov. 1992.
[6] R. Das, J. Saltz, and R. von Hanxleden, "Slicing Analysis and Indirect Access to Distributed Arrays," Proc. Sixth Workshop Languages and Compilers for Parallel Computing, pp. 152-168, Springer-Verlag, Aug. 1993.
[7] C. Ding and K. Kennedy, "Improving Cache Performance in Dynamic Applications through Data and Computation Reorganization at Run Time," Proc. ACM SIGPLAN '99 Conf. Programming Language Design and Implementation, pp. 229-241, May 1999.
[8] J. Dongarra, I.S. Duff, D.C. Sorensen, and H.A. van der Vorst, Solving Linear Systems on Vector and Shared Memory Computers. Philadelphia: SIAM, 1991.
[9] K. Eswar, P. Sadayappan, and C.-H. Huang, "Compile-Time Characterization of Recurrent Patterns in Irregular Computations," Proc. 22nd Int'l Conf. Parallel Processing, vol. 2, pp. 148-155, 1993.
[10] M. Gupta, "Automatic Data Partitioning on Distributed Memory Multicomputers," PhD thesis, Center for Reliable and High-Performance Computing, Univ. of Illinois, Urbana-Champaign, Sept. 1992.
[11] R. von Hanxleden, K. Kennedy, C. Koelbel, R. Das, and J. Saltz, "Compiler Analysis for Irregular Problems in Fortran D," Proc. Fifth Workshop Languages and Compilers for Parallel Computing, Aug. 1992.
[12] High Performance Fortran Forum, High Performance Fortran Language Specification, Version 1.1, technical report, Center for Research on Parallel Computation, Rice Univ., Houston, Tex., Nov. 1994.
[13] R. Hockney and C. Jesshope, Parallel Computers: Architecture, Programming and Algorithms. Adam Hilger, 1981.
[14] M.T. Jones and P.E. Plassmann, "The Efficient Parallel Iterative Solution of Large Sparse Linear Systems," Technical Report MCS-P314-0692, Argonne Nat'l Laboratory, Dec. 1992.
[15] P.M.W. Knijnenburg and H.A.G. Wijshoff, “On Improving Data Locality in Sparse Matrix Computations,” Technical Report 94-15, Department of Computer Science, Leiden Univ., 1994.
[16] C. Koelbel, D. Loveman, R. Schreiber, G. Steele Jr., and M. Zosel, The High Performance Fortran Handbook. MIT Press, 1994.
[17] S.R. Kohn and S.B. Baden, "A Robust Parallel Programming Model for Dynamic, Non-Uniform Scientific Computation," Proc. 1994 Scalable High Performance Computing Conf., pp. 509-517, Knoxville, Tenn., May 1994.
[18] A. Lain and P. Banerjee, “Techniques to Overlap Computation and Communication in Irregular Iterative Applications,” Proc. Eighth ACM Int'l Conf. Supercomputing, pp. 236–245, July 1994.
[19] A. Lain and P. Banerjee, "Exploiting Spatial Regularity in Irregular Iterative Applications," Proc. Ninth Int'l Parallel Processing Symp., pp. 820-827, IEEE Press, Piscataway, N.J., 1995.
[20] A. Lain and P. Banerjee, "Compiler Support for Hybrid Irregular Accesses on Multicomputers," Proc. 10th ACM Int'l Conf. Supercomputing, 1996.
[21] A. Lain, "Compiler and Run-Time Support for Irregular Computations," PhD thesis CRHC-92-22, Dept. of Computer Science, Univ. of Illinois, Urbana, Oct. 1995.
[22] J. Mellor-Crummey, D. Whalley, and K. Kennedy, “Improving Memory Hierarchy Performance for Irregular Applications,” The Int'l Conf. Supercomputing, pp. 425–433, June 1999.
[23] Message-Passing Interface Forum, Document for a Standard Message-Passing Interface, Version 1.0, 1994.
[24] E. Morel and C. Renvoise, "Global Optimization by Suppression of Partial Redundancies," Comm. ACM, vol. 22, no. 2, pp. 96-103, Feb. 1979.
[25] A. Muller and R. Ruhl, "Extending High Performance Fortran for the Support of Unstructured Computations," Proc. Ninth ACM Int'l Conf. Supercomputing, pp. 127-136, 1995.
[26] Netlib, ITPACK, technical report.
[27] K.J. Ottenstein and L.M. Ottenstein, "The Program Dependence Graph in a Software Development Environment," ACM SIGPLAN Notices, vol. 19, no. 5, pp. 177-184, May 1984.
[28] C.-W. Ou, M. Gunwani, and S. Ranka, "Architecture-Independent Locality-Improving Transformations of Computational Graphs Embedded in k-Dimensions," Proc. Ninth ACM Int'l Conf. Supercomputing, pp. 289-297, July 1995.
[29] C.D. Polychronopoulos, M. Girkar, M.R. Haghighat, C.L. Lee, B. Leung, and D. Schouten, “Parafrase-2: An Environment for Parallelizing, Partitioning, Synchronizing, and Scheduling Programs on Multiprocessors,” Proc. 18th Int'l Conf. Parallel Processing, pp. II:39–48, Aug. 1989.
[30] R. Ponnusamy, J. Saltz, and A. Choudhary, "Runtime-Compilation Techniques for Data Partitioning and Communication Schedule Reuse," Proc. Supercomputing '93, pp. 361-370, Los Alamitos, Calif.: IEEE CS Press, Nov. 1993. Also available as Univ. of Maryland Technical Reports CS-TR-3055 and UMIACS-TR-93-32.
[31] R. Ruhl, "A Parallelizing Compiler for Distributed Memory Parallel Processors," PhD thesis, ETH Zurich, 1994.
[32] Y. Saad, SPARSKIT: A Basic Tool Kit for Sparse Matrix Computations, RIACS.
[33] J. Saltz, K. Crowley, R. Mirchandaney, and H. Berryman, "Run-Time Scheduling and Execution of Loops on Message Passing Machines," J. Parallel and Distributed Computing, vol. 8, pp. 303-312, 1990.
[34] H.D. Simon, ed., Parallel Computational Fluid Dynamics. MIT Press, 1992.
[35] F. Tip, "A Survey of Program Slicing Techniques," Technical Report CS-R9438, CWI, Amsterdam, 1994.
[36] R. von Hanxleden and K. Kennedy, "Give-N-Take—A Balanced Code Placement Framework," Proc. SIGPLAN '94 Conf. Programming Language Design and Implementation, pp. 107-120. ACM Press, June 1994.
[37] R. von Hanxleden, "Compiler Support for Machine-Independent Parallelization of Irregular Problems," PhD thesis, Rice Univ., Dec. 1994.
[38] Y. Zeng and S.G. Abraham, "Partitioning Regular Grid Applications with Irregular Boundaries for Cache-Coherent Multiprocessors," Proc. Ninth Int'l Parallel Processing Symp., pp. 222-228, Apr. 1995.

Index Terms:
Irregular applications, iterative, runtime support, compiler support, distributed memory multicomputers, runtime compilation.
Antonio Lain, Dhruva R. Chakrabarti, Prithviraj Banerjee, "Compiler and Run-Time Support for Exploiting Regularity within Irregular Applications," IEEE Transactions on Parallel and Distributed Systems, vol. 11, no. 2, pp. 119-135, Feb. 2000, doi:10.1109/71.841749