International Parallel and Distributed Processing Symposium (IPDPS'03)
Compiler and Runtime Support for Running OpenMP Programs on Pentium- and Itanium-Architectures
Nice, France
April 22-26, 2003
ISBN: 0-7695-1926-1
Exploiting Thread-Level Parallelism (TLP) is a promising way to improve application performance with the advent of general-purpose, cost-effective uniprocessor and shared-memory multiprocessor systems. In this paper, we describe the OpenMP implementation in the Intel® C++ and Fortran compilers for Intel architectures. We present the major design considerations and decisions in the Intel compiler for generating efficient multithreaded code guided by OpenMP directives and pragmas. We describe several transformation phases in the compiler for OpenMP parallelization. In addition to compiler support, the OpenMP runtime library is a critical part of the Intel compiler. We present runtime techniques developed in the Intel OpenMP runtime library for exploiting thread-level parallelism and for integrating OpenMP support with other forms of threading, termed sibling parallelism. Performance results for a set of benchmarks show good speedup over well-optimized serial code on Intel® Pentium® and Itanium® processor-based systems.
Index Terms:
Parallelization, Hyper-Threading technology, OpenMP, compiler optimization, thread-level parallelism, shared-memory multiprocessor
Citation:
Xinmin Tian, Milind Girkar, Sanjiv Shah, Douglas Armstrong, Ernesto Su, Paul Petersen, "Compiler and Runtime Support for Running OpenMP Programs on Pentium- and Itanium-Architectures," International Parallel and Distributed Processing Symposium (IPDPS'03), p. 130a, 2003