Parallel and Distributed Processing Symposium, International (2006)
Rhodes Island, Greece
Apr. 25, 2006 to Apr. 29, 2006
G.R. Gao, Dept. of Electr. & Comput. Eng., Delaware Univ., USA
This paper addresses the underlying sources of performance degradation (e.g., latency, overhead, and starvation) and the obstacles to programmer productivity (e.g., explicit locality management and scheduling, performance tuning, fragmented memory, and synchronous global barriers), with the goal of dramatically enhancing the broad effectiveness of parallel processing for high-end computing (HEC). We are developing a hierarchical threaded virtual machine (HTVM) that defines a dynamic, multithreaded execution model and programming model, providing an architecture abstraction for the development of HEC system software and tools. We are working on a prototype language, LITL-X (pronounced "little-X", for latency intrinsic-tolerant language), which provides application programmers with a powerful set of semantic constructs for organizing parallel computations in a way that hides or manages latency and limits the effects of overhead. This approach is quite different from locality management, although the intent of both strategies is to minimize the effect of latency on the efficiency of computation. We are also working on a dynamic compilation and runtime model to achieve efficient execution of LITL-X programs. We have studied several adaptive optimizations, as well as a methodology for incorporating domain-specific knowledge into program optimization. Finally, we plan to implement our method in an experimental testbed for an HEC architecture and to perform a qualitative and quantitative evaluation on selected applications.
program adaptive optimization, hierarchical multithreading, programming model, system software, performance degradation, overhead effect, programmer productivity, explicit locality management, performance tuning, fragmented memory, synchronous global barrier, parallel processing, high end computing, hierarchical threaded virtual machine, architecture abstraction, prototype language, LITL-X, latency intrinsic-tolerant language, dynamic compilation, runtime model, domain-specific knowledge
G. Gao, R. Stevens, M. Hereld, Weirong Zhu and T. Sterling, "Hierarchical multithreading: programming model and system software," Parallel and Distributed Processing Symposium, International (IPDPS), Rhodes Island, Greece, 2006, pp. 317.