Guarantee Strict Fairness and Utilize Prediction Better in Parallel Job Scheduling
April 2014 (vol. 25 no. 4)
pp. 971-981
Keqin Li, Dept. of Comput. Sci., State Univ. of New York, New Paltz, NY, USA
Weimin Zheng, Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
Yongwei Wu, Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
Yulai Yuan, Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
As the most widely used parallel job scheduling strategy, EASY backfilling has achieved great success, not only because it can balance fairness and performance, but also because it is universally applicable to most HPC systems. However, unfairness still exists in EASY. Our simulation shows that a blocked job can be delayed by later jobs for more than 90 hours on real workloads. Additionally, directly employing runtime prediction techniques in EASY would lead to a serious situation called reservation violation. In this paper, we aim to guarantee strict fairness (no job is delayed by any job of lower priority) while achieving attractive performance, and to employ prediction without causing reservation violation in parallel job scheduling. We propose two novel strategies, namely, shadow load preemption (SLP) and venture backfilling (VB), which are integrated into EASY to construct preemptive venture EASY backfilling (PV-EASY). Experimental results on three real HPC workloads demonstrate that PV-EASY is more attractive than EASY in parallel job scheduling, from both academic and industry perspectives.
Index Terms:
Runtime, Processor scheduling, Job shop scheduling, Delays, Program processors, virtualization, Checkpoints, modeling and prediction, parallel system, scheduling
Citation:
Keqin Li, Weimin Zheng, Yongwei Wu, Yulai Yuan, "Guarantee Strict Fairness and Utilize Prediction Better in Parallel Job Scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 4, pp. 971-981, April 2014, doi:10.1109/TPDS.2013.88