13th International Conference on Parallel and Distributed Systems - Volume 2 (ICPADS'07)
Virtualization aware job schedulers for checkpoint-restart
Hsinchu, Taiwan
December 5-7, 2007
ISBN: 978-1-4244-1889-3
R. Badrinath, Hewlett-Packard, USA
R. Krishnakumar, Hewlett-Packard, USA
R.K. Palanivel Rajan, Hewlett-Packard, USA
Application checkpoint and restart has been a widely studied problem over the last several decades. Despite an immense volume of theory and several research-project-level implementations, there are very few working solutions for parallel distributed applications (such as MPI programs on a cluster). We describe our experiences in enhancing a job scheduler to leverage the mechanisms of a virtual machine environment to support checkpoint-restart. We also describe the basic coordinated checkpoint-restart framework that we implemented and on which this solution is based.
Citation:
R. Badrinath, R. Krishnakumar, R.K. Palanivel Rajan, "Virtualization aware job schedulers for checkpoint-restart," icpads, vol. 2, pp. 1-7, 13th International Conference on Parallel and Distributed Systems - Volume 2 (ICPADS'07), 2007.