Fourth IEEE International Conference on Cluster Computing (CLUSTER'02)
An Architecture for Integrated Resource Management of MPI Jobs
Chicago, Illinois
September 23-26, 2002
ISBN: 0-7695-1745-5
Steve Sistare, Sun Microsystems, Inc.
Jack Test, Sun Microsystems, Inc.
Dave Plauger, Sun Microsystems, Inc.
We present a new architecture for the integration of distributed resource management systems and parallel run-time environments such as MPI. The architecture solves the long-standing problem of achieving a tight integration between the two in a clean and robust manner that fully enables the functionality of both systems, including resource limit enforcement and accounting. We also present a more uniform command interface to the user, which simplifies the task of running parallel jobs and tools under a resource manager. The architecture is extensible and allows new systems to be incorporated. We describe the properties that a resource management system must have to work in this architecture, and find that these properties are ubiquitous in the resource management world. Using the Sun™ Cluster Runtime Environment, we show the generality of the approach by implementing tight integrations with PBS, LSF, and Sun Grid Engine software, and we demonstrate the advantages of a tight integration. No modifications or enhancements to these resource management systems were required, in marked contrast to ad hoc approaches, which typically require such changes.
Citation:
Steve Sistare, Jack Test, Dave Plauger, "An Architecture for Integrated Resource Management of MPI Jobs," Fourth IEEE International Conference on Cluster Computing (CLUSTER'02), p. 370, 2002