Dynamic Fractional Resource Scheduling versus Batch Scheduling
March 2012 (vol. 23 no. 3)
pp. 521-529
Mark Stillwell, INRIA, the University of Lyon, and the LIP laboratory of ENS Lyon, Lyon
Frédéric Vivien, INRIA, the University of Lyon, and the LIP laboratory of ENS Lyon, Lyon
Henri Casanova, University of Hawaii at Manoa, Honolulu
We propose a novel job scheduling approach for homogeneous cluster computing platforms. Its key feature is the use of virtual machine technology to share fractional node resources in a precise and controlled manner. Other VM-based scheduling approaches have focused primarily on technical issues or extensions to existing batch scheduling systems, while we take a more aggressive approach and seek to find heuristics that maximize an objective metric correlated with job performance. We derive absolute performance bounds and develop algorithms for the online nonclairvoyant version of our scheduling problem. We further evaluate these algorithms in simulation against both synthetic and real-world HPC workloads and compare our algorithms to standard batch scheduling approaches. We find that our approach improves over batch scheduling by orders of magnitude in terms of job stretch, while leading to comparable or better resource utilization. Our results demonstrate that virtualization technology coupled with lightweight online scheduling strategies can afford dramatic improvements in performance for executing HPC workloads.
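The abstract reports results in terms of job stretch without defining it. As a minimal sketch, assuming the definition commonly used in the scheduling literature (the time a job spends in the system divided by the time it would take on a dedicated platform), the metric could be computed as follows. The function names and the sample jobs are illustrative assumptions, not taken from the paper, which may use a variant of this metric.

```python
def stretch(release_time, completion_time, dedicated_runtime):
    """Stretch of one job under the assumed definition:
    (completion_time - release_time) / dedicated_runtime."""
    if dedicated_runtime <= 0:
        raise ValueError("dedicated_runtime must be positive")
    if completion_time < release_time:
        raise ValueError("a job cannot complete before it is released")
    return (completion_time - release_time) / dedicated_runtime


def max_stretch(jobs):
    """Largest stretch over an iterable of
    (release_time, completion_time, dedicated_runtime) tuples."""
    return max(stretch(*job) for job in jobs)


# Hypothetical jobs, times in seconds.
jobs = [
    (0.0, 100.0, 100.0),  # long job that ran undisturbed: stretch 1.0
    (0.0, 110.0, 10.0),   # short job that waited behind it: stretch 11.0
]
print(max_stretch(jobs))
```

Under this definition a short job that waits in a queue behind a long one is penalized heavily, which is why stretch is sensitive to the delays that batch queues impose on short jobs.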

Index Terms:
Cluster, scheduler, virtual machine, vector bin packing, high-performance computing, batch scheduling.
Citation:
Mark Stillwell, Frédéric Vivien, Henri Casanova, "Dynamic Fractional Resource Scheduling versus Batch Scheduling," IEEE Transactions on Parallel and Distributed Systems, vol. 23, no. 3, pp. 521-529, March 2012, doi:10.1109/TPDS.2011.183