2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) (2016)
May 16, 2016 to May 19, 2016
The high-performance computing (HPC) scheduling landscape is changing. Increasingly, large scientific computations include high-throughput, data-intensive, and stream-processing compute models. These jobs increase workload heterogeneity, which presents challenges for classical HPC schedulers oriented toward tightly coupled MPI jobs. Thus, it is important to define new analysis methods to understand the heterogeneity of the workload and its possible effect on the performance of current systems. In this paper, we present a methodology to assess job heterogeneity in workloads and scheduling queues. We apply the method to the 2014 workloads of three current National Energy Research Scientific Computing Center (NERSC) systems. Finally, we present the results of this analysis, with the observation that heterogeneity might reduce the predictability of job wait times.
Geometry, Queueing analysis, Clocks, Torque, Processor scheduling, Scheduling, Runtime
Gonzalo Pedro Rodrigo Alvarez, Per-Olov Ostberg, Erik Elmroth, Katie Antypas, Richard Gerber, Lavanya Ramakrishnan, "Towards Understanding Job Heterogeneity in HPC: A NERSC Case Study", 2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), pp. 521-526, 2016, doi:10.1109/CCGrid.2016.32