Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06)
A Failure-Aware Scheduling Strategy in Large-Scale Cluster System
Singapore
May 16-May 19
ISBN: 0-7695-2585-7
Wu Linping, Chinese Academy of Sciences, China
Meng Dan, Chinese Academy of Sciences, China
Jianfeng Zhan, Chinese Academy of Sciences, China
Wang Lei, Chinese Academy of Sciences, China
Tu Bibo, Chinese Academy of Sciences, China
As cluster systems grow in scale, node failure becomes commonplace. Job scheduling, an important part of cluster operating system software, is responsible for efficient resource management and sensible placement of jobs. Job scheduling in a cluster divides into two sub-parts: job selection and node allocation. In this paper, we introduce a failure-aware scheduling strategy, the LUNF (Longest Uptime Node First) node allocation policy, which uses a characterization of node failures. Simulation results show that the LUNF policy yields better system performance than a random node allocation policy.
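The abstract names the policy but gives no implementation detail, so the following is only a minimal sketch of what a "longest uptime node first" allocation step might look like next to the random baseline. It is inferred from the policy's name and is not the authors' code. The `Node` class, its `up_since` and `busy` fields, and the function names are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical node record; the field names are assumptions."""
    name: str
    up_since: float      # time at which the node last came up
    busy: bool = False   # True if the node is already running a job


def lunf_allocate(nodes, count, now):
    """Pick `count` idle nodes, preferring those with the longest uptime.

    Returns None when there are not enough idle nodes for the job.
    """
    idle = [n for n in nodes if not n.busy]
    if len(idle) < count:
        return None
    # Uptime is the time elapsed since the node last came up.
    idle.sort(key=lambda n: now - n.up_since, reverse=True)
    return idle[:count]


def random_allocate(nodes, count, rng=random):
    """Baseline: pick `count` idle nodes uniformly at random."""
    idle = [n for n in nodes if not n.busy]
    if len(idle) < count:
        return None
    return rng.sample(idle, count)


nodes = [
    Node("a", up_since=90.0),
    Node("b", up_since=10.0),
    Node("c", up_since=50.0),
    Node("d", up_since=0.0, busy=True),
]
chosen = lunf_allocate(nodes, 2, now=100.0)
print([n.name for n in chosen])
```

In this sketch, job selection (which job runs next) is left out entirely; only the node allocation step is shown, since that is the part the abstract attributes to LUNF.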
Citation:
Wu Linping, Meng Dan, Jianfeng Zhan, Wang Lei, Tu Bibo, "A Failure-Aware Scheduling Strategy in Large-Scale Cluster System," ccgrid, pp.645-648, Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06), 2006