19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 18
Monitoring and Debugging Parallel Software with BCS-MPI on Large-Scale Clusters
Denver, Colorado
April 04-April 08
ISBN: 0-7695-2312-9
Juan Fernández, Universidad de Murcia, Spain
Fabrizio Petrini, Los Alamos National Laboratory, NM
Eitan Frachtenberg, Los Alamos National Laboratory, NM
Buffered CoScheduled (BCS) MPI is a novel implementation of MPI based on global synchronization of all system activities. BCS-MPI imposes a model where all processes and their communication are tightly scheduled at a very fine granularity. Thus, BCS-MPI provides a system that is much more controllable and deterministic. BCS-MPI leverages this regular behavior to provide a simple yet powerful monitoring and debugging subsystem that streamlines the analysis of parallel software. This subsystem, called the Monitoring and Debugging System (MDS), provides exhaustive process and communication scheduling statistics. This paper covers in detail the design and implementation of the MDS subsystem, and demonstrates how the MDS can be used to monitor and debug not only parallel MPI applications but also the BCS-MPI runtime system itself. Additionally, we show that this functionality need not incur a significant performance loss.
Citation:
Juan Fernández, Fabrizio Petrini, Eitan Frachtenberg, "Monitoring and Debugging Parallel Software with BCS-MPI on Large-Scale Clusters," ipdps, vol. 19, pp. 300a, 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 18, 2005