2012 IEEE International Symposium on Performance Analysis of Systems & Software (2012)
New Brunswick, NJ, USA
Apr. 1, 2012 to Apr. 3, 2012
Naila Farooqui, College of Computing, Georgia Institute of Technology, USA
Andrew Kerr, School of Electrical and Computer Engineering, Georgia Institute of Technology, USA
Greg Eisenhauer, College of Computing, Georgia Institute of Technology, USA
Karsten Schwan, College of Computing, Georgia Institute of Technology, USA
Sudhakar Yalamanchili, School of Electrical and Computer Engineering, Georgia Institute of Technology, USA
As parallel execution platforms continue to proliferate, there is a growing need for real-time introspection tools that provide insight into platform behavior, both to support performance debugging and correctness checks and to drive effective resource management schemes. To address this need, we present the Lynx dynamic instrumentation system. Lynx provides the capability to write instrumentation routines that are (1) selective, instrumenting only what is needed, (2) transparent, without changes to the applications' source code, (3) customizable, and (4) efficient. Lynx is embedded into the broader GPU Ocelot system, which provides run-time code generation of CUDA programs for heterogeneous architectures. This paper describes (1) the Lynx framework and implementation, (2) its language constructs geared to the Single Instruction Multiple Data (SIMD) model of data-parallel programming used in current general-purpose GPU (GPGPU) based systems, and (3) useful performance metrics described via Lynx's instrumentation language that provide insights into the design of effective instrumentation routines for GPGPU systems. The paper concludes with a comparison of Lynx with existing GPU profiling tools and a quantitative assessment of Lynx's instrumentation performance, providing insights into optimization opportunities for running instrumented GPU kernels.
S. Yalamanchili, N. Farooqui, G. Eisenhauer, A. Kerr and K. Schwan, "Lynx: A dynamic instrumentation system for data-parallel applications on GPGPU architectures," 2012 IEEE International Symposium on Performance Analysis of Systems & Software (ISPASS), New Brunswick, NJ, USA, 2012, pp. 58-67.