Orchestrating an Ensemble of MapReduce Jobs for Minimizing Their Makespan
PrePrint
ISSN: 1545-5971
Abhishek Verma, University of Illinois at Urbana-Champaign, Urbana
Ludmila Cherkasova, Hewlett-Packard Labs, Palo Alto
Roy H. Campbell, University of Illinois at Urbana-Champaign, Urbana
A key challenge in MapReduce environments is to increase the utilization of MapReduce clusters in order to minimize their cost. For a set of production jobs that are executed periodically on new data, we can perform an off-line analysis to evaluate the performance benefits of different optimization techniques. In this work, we consider a subset of production workloads that consists of MapReduce jobs with no dependencies. We observe that the order in which these jobs are executed can have a significant impact on their overall completion time (makespan) and on cluster resource utilization. We evaluate the performance benefits of a job schedule constructed with Johnson's algorithm through an extensive set of simulations over a variety of realistic workloads. The results are workload and cluster-size dependent, but it is typical to achieve makespan improvements of up to 10%-25% by simply processing the jobs in the right order. However, in some cases, the simplified abstraction assumed by Johnson's algorithm may lead to a suboptimal job schedule. We design a novel heuristic, called BalancedPools, that significantly improves on the results of Johnson's schedule (by up to 15%-38%) in exactly those situations where it produces a suboptimal makespan. Overall, we observe makespan improvements of up to 50% with the new BalancedPools algorithm.
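The effect of job order on makespan can be illustrated with a minimal sketch of the classic Johnson's rule for a two-stage pipeline. It rests on several assumptions that the abstract does not spell out: each job is reduced to a hypothetical pair of (map duration, reduce duration), each stage processes one job at a time, and a job's reduce stage cannot start before its map stage finishes. The job names and durations are made up for illustration, and the names `johnson_order` and `makespan` are not taken from the paper. BalancedPools is not shown here.

```python
from typing import List, Tuple

# Hypothetical job model: (name, map_duration, reduce_duration).
Job = Tuple[str, float, float]


def johnson_order(jobs: List[Job]) -> List[Job]:
    """Order jobs by Johnson's rule for a two-stage flow shop."""
    # Jobs whose first stage is no longer than their second stage go first,
    # sorted by increasing first-stage duration.
    head = sorted((j for j in jobs if j[1] <= j[2]), key=lambda j: j[1])
    # The remaining jobs go last, sorted by decreasing second-stage duration.
    tail = sorted((j for j in jobs if j[1] > j[2]), key=lambda j: j[2], reverse=True)
    return head + tail


def makespan(schedule: List[Job]) -> float:
    """Completion time of the last reduce stage under the simplified model."""
    map_end = 0.0     # time at which the map stage becomes free
    reduce_end = 0.0  # time at which the reduce stage becomes free
    for _, map_time, reduce_time in schedule:
        map_end += map_time
        # A reduce stage waits for its own map stage and the previous reduce.
        reduce_end = max(reduce_end, map_end) + reduce_time
    return reduce_end


if __name__ == "__main__":
    # Made-up durations, not data from the paper.
    jobs = [("A", 4, 5), ("B", 1, 4), ("C", 30, 4), ("D", 6, 30), ("E", 2, 3)]
    ordered = johnson_order(jobs)
    print("submission order makespan:", makespan(jobs))
    print("Johnson order:", [name for name, _, _ in ordered])
    print("Johnson order makespan:", makespan(ordered))
```

Comparing `makespan(jobs)` with `makespan(johnson_order(jobs))` shows how much a reordering alone can change the completion time in this toy model. A real cluster runs many map and reduce tasks in parallel across slots, which is one way the simplified abstraction can diverge from actual behavior.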
Index Terms:
Performance attributes, Computer Systems Organization, Performance of Systems, Modeling techniques
Citation:
Abhishek Verma, Ludmila Cherkasova, Roy H. Campbell, "Orchestrating an Ensemble of MapReduce Jobs for Minimizing Their Makespan," IEEE Transactions on Dependable and Secure Computing, 14 Feb. 2013. IEEE Computer Society Digital Library. IEEE Computer Society, <http://doi.ieeecomputersociety.org/10.1109/TDSC.2013.14>