DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TDSC.2013.14
A key challenge in MapReduce environments is to increase the utilization of MapReduce clusters in order to minimize their cost. For a set of production jobs that are executed periodically on new data, we can perform an off-line analysis to evaluate the performance benefits of different optimization techniques. In this work, we consider a subset of production workloads that consists of MapReduce jobs with no dependencies. We observe that the order in which these jobs are executed can have a significant impact on their overall completion time (makespan) and on cluster resource utilization. We construct a job schedule using Johnson's algorithm and evaluate its performance benefits through an extensive set of simulations over a variety of realistic workloads. The results depend on the workload and the cluster size, but it is typical to achieve makespan improvements of up to 10%-25% simply by processing the jobs in the right order. However, in some cases, the simplified abstraction assumed by Johnson's algorithm may lead to a suboptimal job schedule. We design a novel heuristic, called BalancedPools, that significantly improves on the results of Johnson's schedule (by up to 15%-38%) precisely in the situations where that schedule produces a suboptimal makespan. Overall, we observe makespan improvements of up to 50% with the new BalancedPools algorithm.
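To illustrate why job order affects the makespan, the following is a minimal sketch of Johnson's rule for a two-stage schedule. It is not the authors' implementation. It assumes a simplified model in which each job is a (map duration, reduce duration) pair, each stage occupies the whole cluster, and a job's reduce stage can start only after its own map stage and the previous job's reduce stage have finished. The `Job` tuple, the function names, and the sample durations are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical job representation: (name, map_duration, reduce_duration).
Job = Tuple[str, float, float]


def johnson_order(jobs: List[Job]) -> List[Job]:
    """Order jobs by Johnson's rule for a two-stage flow shop."""
    # Jobs whose map stage is no longer than their reduce stage run first,
    # in ascending order of map duration.
    head = sorted((j for j in jobs if j[1] <= j[2]), key=lambda j: j[1])
    # The remaining jobs run last, in descending order of reduce duration.
    tail = sorted((j for j in jobs if j[1] > j[2]), key=lambda j: j[2], reverse=True)
    return head + tail


def makespan(schedule: List[Job]) -> float:
    """Completion time of the last reduce stage under the simplified model."""
    map_end = 0.0     # time at which the map stage of the current job finishes
    reduce_end = 0.0  # time at which the reduce stage of the current job finishes
    for _, map_time, reduce_time in schedule:
        map_end += map_time
        # Reduce waits for this job's maps and for the previous job's reduces.
        reduce_end = max(reduce_end, map_end) + reduce_time
    return reduce_end


# Illustrative durations only; these are not taken from the paper.
jobs = [("J1", 4, 5), ("J2", 1, 4), ("J3", 30, 4), ("J4", 6, 30), ("J5", 2, 3)]

print(makespan(jobs))                 # submission order: 74.0
print(makespan(johnson_order(jobs)))  # Johnson's order:  47.0
```

In this toy instance, reordering alone shortens the makespan because short map stages are scheduled early, so the reduce stage starts sooner and stays busy. A real cluster, where jobs may not use all map and reduce slots, departs from this model, which is the kind of case the abstract says can make Johnson's schedule suboptimal.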
Index Terms:
Performance attributes, Computer Systems Organization, Performance of Systems, Modeling techniques
Citation:
Abhishek Verma, Ludmila Cherkasova, Roy H. Campbell, "Orchestrating an Ensemble of MapReduce Jobs for Minimizing Their Makespan," IEEE Transactions on Dependable and Secure Computing, 14 Feb. 2013. IEEE Computer Society Digital Library. IEEE Computer Society, <http://doi.ieeecomputersociety.org/10.1109/TDSC.2013.14>