Issue No. 05 - Sept.-Oct. 2013 (vol. 10)
pp. 314-327
Abhishek Verma , University of Illinois at Urbana-Champaign, Urbana
Ludmila Cherkasova , Hewlett-Packard Labs, Palo Alto
Roy H. Campbell , University of Illinois at Urbana-Champaign, Urbana
Cloud computing offers an attractive option for businesses to rent a suitably sized MapReduce cluster, consume resources as a service, and pay only for the resources consumed. A key challenge in such environments is to increase the utilization of MapReduce clusters in order to minimize their cost. One way of achieving this goal is to optimize the execution of MapReduce jobs on the cluster. For a set of production jobs that are executed periodically on new data, we can perform an offline analysis to evaluate the performance benefits of different optimization techniques. In this work, we consider a subset of production workloads consisting of MapReduce jobs with no dependencies. We observe that the order in which these jobs are executed can have a significant impact on their overall completion time and on cluster resource utilization. Our goal is to automate the design of a job schedule that minimizes the completion time (makespan) of such a set of MapReduce jobs. We introduce a simple abstraction in which each MapReduce job is represented as a pair of map- and reduce-stage durations. This representation enables us to apply the classic Johnson's algorithm, which was designed for building an optimal two-stage job schedule. We evaluate the performance benefits of the constructed schedule through an extensive set of simulations over a variety of realistic workloads. The results are workload- and cluster-size dependent, but makespan improvements of 10-25 percent are typical simply from processing the jobs in the right order. However, in some cases, the simplified abstraction assumed by Johnson's algorithm may lead to a suboptimal job schedule. We design a novel heuristic, called BalancedPools, that significantly improves on Johnson's schedule (by up to 15-38 percent), precisely in the situations where it produces a suboptimal makespan. Overall, we observe makespan improvements of up to 50 percent with the new BalancedPools algorithm.
The results of our simulation study are validated through experiments on a 66-node Hadoop cluster.
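To illustrate the abstraction the paper builds on, here is a minimal sketch (not the authors' implementation) of Johnson's rule applied to jobs represented as (map duration, reduce duration) pairs, together with the standard two-stage pipeline makespan it minimizes; function names and the sample durations are illustrative assumptions.

```python
def johnson_order(jobs):
    """Order (map, reduce) duration pairs by Johnson's rule:
    jobs whose map stage is shorter than their reduce stage run
    first, sorted by ascending map time; the remaining jobs run
    last, sorted by descending reduce time."""
    front = sorted((j for j in jobs if j[0] < j[1]), key=lambda j: j[0])
    back = sorted((j for j in jobs if j[0] >= j[1]), key=lambda j: -j[1])
    return front + back

def makespan(jobs):
    """Two-stage pipeline makespan: a job's reduce stage starts
    only after its own map stage and the previous job's reduce
    stage have both finished."""
    map_done = reduce_done = 0
    for m, r in jobs:
        map_done += m
        reduce_done = max(reduce_done, map_done) + r
    return reduce_done
```

For example, for the (hypothetical) job set `[(4, 5), (4, 1), (30, 4), (6, 30), (2, 3)]`, Johnson's order yields a makespan of 47 versus 77 for the submission order, illustrating why execution order alone can shift the makespan substantially.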
Schedules, Production, Upper bound, Clustering algorithms, Algorithm design and analysis, Computational modeling, Business, minimized makespan, MapReduce, Hadoop, batch workloads, optimized schedule
Abhishek Verma, Ludmila Cherkasova, Roy H. Campbell, "Orchestrating an Ensemble of MapReduce Jobs for Minimizing Their Makespan", IEEE Transactions on Dependable and Secure Computing, vol.10, no. 5, pp. 314-327, Sept.-Oct. 2013, doi:10.1109/TDSC.2013.14