Parallel and Distributed Processing Symposium, International (2008)
Miami, FL, USA
Apr. 14, 2008 to Apr. 18, 2008
ISBN: 978-1-4244-1693-6
pp: 1-8
Mackale Joyner, Department of Computer Science, Rice University, USA
Vivek Sarkar, Department of Computer Science, Rice University, USA
Rui Zhang, Department of Computer Science, Rice University, USA
Zoran Budimlic, Department of Computer Science, Rice University, USA
ABSTRACT
This paper presents an interprocedural rank analysis algorithm that automatically infers the ranks of arrays in X10, a language that supports rank-independent specification of loop and array computations using regions and points. We use the rank analysis information to enable storage transformations on arrays. We evaluate a transformation that converts high-level multidimensional X10 arrays into lower-level multidimensional Java arrays when it is legal to do so. Preliminary performance results for a set of parallel computational benchmarks on a 64-way AIX Power5+ SMP machine show that our optimizations deliver performance that rivals that of lower-level, hand-tuned code with explicit loops and array accesses, and that is up to two orders of magnitude faster than unoptimized, high-level X10 programs. The results also show that our optimizations improve the scalability of X10 programs: the relative performance improvements over the unoptimized versions increase as we scale the parallelism from 1 CPU to 64 CPUs.
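As a rough illustration of the kind of storage transformation the abstract describes, the Java sketch below contrasts a point-indexed array whose rank is only known at run time with a lowered version that uses a plain two-dimensional Java array once the rank is known to be 2. It is a minimal sketch under stated assumptions, not the paper's algorithm or compiler output: the names RankSketch, PointArray, sumHighLevel, and sumLowered are hypothetical, and PointArray is only a stand-in for a high-level X10 array.

```java
// Illustrative sketch only. RankSketch, PointArray, sumHighLevel and sumLowered
// are hypothetical names; this is not X10 code or the paper's generated code.
public class RankSketch {

    /** Stand-in for a high-level array that is indexed by points of any rank. */
    static final class PointArray {
        private final int[] dims;    // extent of each dimension
        private final double[] data; // flat row-major storage

        PointArray(int... dims) {
            this.dims = dims.clone();
            int size = 1;
            for (int d : dims) {
                size *= d;
            }
            this.data = new double[size];
        }

        // Rank-independent addressing: the rank is discovered at run time.
        private int offset(int[] point) {
            if (point.length != dims.length) {
                throw new IllegalArgumentException("rank mismatch");
            }
            int off = 0;
            for (int k = 0; k < dims.length; k++) {
                off = off * dims[k] + point[k];
            }
            return off;
        }

        double get(int[] point) {
            return data[offset(point)];
        }

        void set(int[] point, double value) {
            data[offset(point)] = value;
        }
    }

    /** High-level style: every access allocates a point and goes through offset(). */
    public static double sumHighLevel(int n, int m) {
        PointArray a = new PointArray(n, m);
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < m; j++) {
                int[] p = {i, j};
                a.set(p, i * m + j);
                sum += a.get(p);
            }
        }
        return sum;
    }

    /** Lowered style: with the rank known to be 2, use a Java double[][] directly. */
    public static double sumLowered(int n, int m) {
        double[][] a = new double[n][m];
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < m; j++) {
                a[i][j] = i * m + j;
                sum += a[i][j];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Both versions compute the same value; only the storage and access path differ.
        System.out.println(sumHighLevel(3, 4));
        System.out.println(sumLowered(3, 4));
    }
}
```

The lowered version is only valid when the rank of the array is known statically, which is the information a rank analysis would have to supply; the sketch simply assumes rank 2.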
CITATION
Mackale Joyner, Vivek Sarkar, Rui Zhang, Zoran Budimlic, "Array optimizations for parallel implementations of high productivity languages", Parallel and Distributed Processing Symposium, International, vol. 00, pp. 1-8, 2008, doi:10.1109/IPDPS.2008.4536185