2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)
Washington, DC, USA, May 1–4, 2018
Large-scale computing solutions are increasingly used in Big Data platforms, where efficient scheduling algorithms play an important role in optimizing cluster resource utilization, throughput, and fairness. This paper addresses the problem of scheduling a set of jobs across a cluster of machines, focusing on fair scheduling when both jobs and machines have heterogeneous characteristics. Although job and cluster diversity is unprecedented, most schedulers do not handle fairness across multiple resource types in a heterogeneous system. We propose a new scheduler called h-Fair that selects jobs according to a global dominant resource fairness policy adapted to heterogeneous clusters, and dispatches each job to the machine whose characteristics are most similar to its resource demands, measured by cosine similarity. We implemented h-Fair in Apache Hadoop YARN and compare it with the existing Fair Scheduler, which uses the dominant resource fairness policy, on the Google workload trace. We show that our implementation achieves better cluster resource utilization and allocates more containers when jobs and machines have heterogeneous characteristics.
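The two-step idea described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: all resource vectors, user names, and machine names are hypothetical, and the selection rules are the generic ones named in the abstract (pick the user with the lowest dominant resource share, then place the job on the machine whose available-capacity vector has the highest cosine similarity to the job's demand vector).

```python
import math

# Illustrative cluster totals: (CPU cores, memory in GB). Not from the paper.
CLUSTER_TOTAL = (64, 128)

def dominant_share(allocated, total=CLUSTER_TOTAL):
    """Largest per-resource fraction of the cluster a user currently holds
    (the 'dominant share' used by dominant resource fairness)."""
    return max(a / t for a, t in zip(allocated, total))

def cosine_similarity(a, b):
    """Cosine of the angle between two resource vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Pending jobs: user -> (demand vector, resources already allocated to user).
jobs = {
    "alice": ((4, 8), (8, 16)),   # dominant share: max(8/64, 16/128) = 0.125
    "bob":   ((2, 4), (24, 8)),   # dominant share: max(24/64, 8/128) = 0.375
}
# Machines: name -> available-capacity vector.
machines = {
    "cpu-heavy": (32, 16),
    "balanced":  (16, 32),
}

# Step 1 (DRF-style fairness): schedule the user with the smallest dominant share.
next_user = min(jobs, key=lambda u: dominant_share(jobs[u][1]))
demand = jobs[next_user][0]

# Step 2 (dispatch): pick the machine most similar to the demand vector.
target = max(machines, key=lambda m: cosine_similarity(demand, machines[m]))
print(next_user, "->", target)  # alice -> balanced
```

Here "alice" is selected because her dominant share (0.125) is lower than "bob"'s (0.375), and her demand (4, 8) is placed on "balanced" because (16, 32) points in exactly the same direction as the demand vector, giving a cosine similarity of 1.0.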
Big Data, computer centres, pattern clustering, resource allocation, scheduling
A. V. Postoaca, F. Pop and R. Prodan, "h-Fair: Asymptotic Scheduling of Heavy Workloads in Heterogeneous Data Centers," 2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), Washington, DC, USA, 2018, pp. 366-369.