Efficient Parallel Execution of Irregular Recursive Programs
February 2002 (vol. 13 no. 2)
pp. 167-178

Abstract: Programs whose parallelism stems from multiple recursion form an interesting subclass of parallel programs with many practical applications. The highly irregular shape of many recursion trees makes it difficult to achieve good load balancing with low overhead. We present a system, called REAPAR, that executes recursive C programs in parallel on SMP machines. Based on data from a single profiling run of the program, REAPAR selects a load-balancing strategy that is both effective and efficient, and it generates parallel code implementing that strategy. The performance obtained by REAPAR on a diverse set of benchmarks matches that published for much more complex systems that require high-level, problem-oriented, explicitly parallel constructs. A case study even found REAPAR to be competitive with handwritten (low-level, machine-oriented) thread-parallel code.
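
The abstract does not spell out which load-balancing strategies REAPAR chooses among, so the following is only a generic sketch of granularity control for a multiply recursive C function, not REAPAR's generated code. It assumes POSIX threads and uses a simple depth cutoff: the top `spawn_depth` levels of the recursion tree run one branch in a new thread, and everything below runs sequentially. The names `fib_seq`, `fib_par`, `fib_task`, and `spawn_depth` are hypothetical, and naive Fibonacci stands in for an irregular workload because its recursion tree is unbalanced.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical workload: naive Fibonacci; its two branches differ in size. */
static uint64_t fib_seq(unsigned n) {
    return n < 2 ? n : fib_seq(n - 1) + fib_seq(n - 2);
}

/* Argument and result slot for one branch executed in its own thread. */
struct fib_task {
    unsigned n;
    unsigned spawn_depth;  /* levels below this call that may still spawn */
    uint64_t result;
};

static void *fib_thread(void *arg);

/* Depth-cutoff granularity control (an assumed strategy, not REAPAR's):
 * spawn a thread for one branch while spawn_depth > 0, else go sequential. */
static uint64_t fib_par(unsigned n, unsigned spawn_depth) {
    if (n < 2) return n;
    if (spawn_depth == 0) return fib_seq(n);

    struct fib_task left = { n - 1, spawn_depth - 1, 0 };
    pthread_t tid;
    int spawned = pthread_create(&tid, NULL, fib_thread, &left) == 0;

    /* The calling thread works on the other branch in the meantime. */
    uint64_t right = fib_par(n - 2, spawn_depth - 1);

    if (spawned) {
        pthread_join(tid, NULL);
    } else {
        /* Thread creation failed: compute the branch sequentially instead. */
        left.result = fib_seq(left.n);
    }
    return left.result + right;
}

static void *fib_thread(void *arg) {
    struct fib_task *t = arg;
    t->result = fib_par(t->n, t->spawn_depth);
    return NULL;
}
```

A call such as `fib_par(30, 3)` creates at most seven extra threads. The cutoff is the kind of parameter that profile data could inform: too small a value leaves processors idle when the tree is lopsided, while too large a value lets thread-creation overhead dominate.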

Index Terms:
Granularity control, irregular problems, recursion, instrumentation, profiling, SMP, benchmarks.
Lutz Prechelt, Stefan U. Hänßgen, "Efficient Parallel Execution of Irregular Recursive Programs," IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 2, pp. 167-178, Feb. 2002, doi:10.1109/71.983944