Cluster Computing and the Grid, IEEE International Symposium on (2011)
Newport Beach, California USA
May 23, 2011 to May 26, 2011
Programming languages have advanced tremendously over the years, but program debuggers have hardly changed. Sequential debuggers do little more than allow a user to control the flow of a program and examine its state. Parallel debuggers support the same operations across multiple processes; this is adequate with a small number of processors, but becomes unwieldy and ineffective on very large machines. Typical scientific codes have enormous multi-dimensional data structures, and it is impractical to expect a user to view the data using traditional display techniques. In this paper we discuss the use of debug-time assertions, and show that these can be used to debug parallel programs. The techniques reduce debugging complexity because they reason about the state of large arrays without requiring the user to know the expected value of every element. Assertions can be expensive to evaluate, but their performance can be improved by running them in parallel. We demonstrate the system with a case study that finds errors in a parallel version of the Shallow Water Equations, and evaluate the performance of the tool on a 4,096-core Cray XE6.
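As a rough illustration of the idea in the abstract (checking a property of a whole array rather than inspecting each element, and splitting that check across workers), the following is a minimal sketch. It is not the Guard debugger's interface or the authors' implementation; the names `check_chunk`, `parallel_assert`, and `height_ok`, and the invariant itself, are hypothetical. Threads are used only to show the partition-and-merge structure; a real tool would evaluate each partition on the process that owns it.

```python
from concurrent.futures import ThreadPoolExecutor
import math


def check_chunk(chunk, offset, predicate):
    """Evaluate the predicate on one partition; return global indices of violations."""
    return [offset + i for i, v in enumerate(chunk) if not predicate(v)]


def parallel_assert(data, predicate, workers=4):
    """Split data into contiguous partitions, check each concurrently, merge results."""
    size = max(1, math.ceil(len(data) / workers))
    parts = [(data[i:i + size], i) for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda p: check_chunk(p[0], p[1], predicate), parts)
    # Flatten the per-partition lists into one ordered list of failing indices.
    return [idx for part in results for idx in part]


def height_ok(h):
    """Hypothetical invariant: a water height must be finite and positive."""
    return math.isfinite(h) and h > 0.0


heights = [1.0 + 0.001 * i for i in range(10_000)]
heights[7321] = float("nan")  # injected fault, for illustration only
violations = parallel_assert(heights, height_ok)
print(violations)  # [7321]
```

The user states one invariant for the whole array and receives only the locations where it fails, instead of viewing 10,000 values.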
parallel debugger, assertion, Guard, MPI
Donny Kurniawan, Minh Ngoc Dinh, Luiz DeRose, Chao Jin, David Abramson, Bob Moench, "Assertion Based Parallel Debugging", Cluster Computing and the Grid, IEEE International Symposium on, vol. 00, pp. 63-72, 2011, doi:10.1109/CCGrid.2011.44