2011 International Conference on Parallel Architectures and Compilation Techniques (2011)
Galveston, Texas USA
Oct. 10, 2011 to Oct. 14, 2011
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/PACT.2011.64
We propose a dynamic verification approach for large-scale message passing programs to locate correctness bugs caused by unforeseen nondeterministic interactions. This approach hinges on an efficient protocol to track the causality between nondeterministic message receive operations and potentially matching send operations. We show that causality tracking protocols that rely solely on logical clocks fail to capture all nuances of MPI program behavior, including the variety of ways in which nonblocking calls can complete. Our approach is hinged on formally defining the matches-before relation underlying the MPI standard, and devising lazy update logical clock based algorithms that can correctly discover all potential outcomes of nondeterministic receives in practice. can achieve the same coverage as a vector clock based algorithm while maintaining good scalability. LLCP allows us to analyze realistic MPI programs involving a thousand MPI processes, incurring only modest overheads in terms of communication bandwidth, latency, and memory consumption.
MPI verification, MPI debugging, causality tracking, dynamic verification
B. R. de Supinski, G. Gopalakrishnan, M. Schulz, G. Bronevetsky, R. M. Kirby and A. Vo, "Large Scale Verification of MPI Programs Using Lamport Clocks with Lazy Update," 2011 International Conference on Parallel Architectures and Compilation Techniques(PACT), Galveston, Texas USA, 2011, pp. 330-339.