Issue No. 11 - Nov. 2013 (vol. 39)
pp: 1564-1581
Sriram Kailasam, Dept. of Comput. Sci. & Eng., Indian Inst. of Technol. Madras, Chennai, India
Nathan Gnanasambandam, Xerox Res. Center, Webster, NY, USA
Janakiram Dharanipragada, Dept. of Comput. Sci. & Eng., Indian Inst. of Technol. Madras, Chennai, India
Naveen Sharma, Xerox Res. Center, Webster, NY, USA
ABSTRACT
Optimizing ordered throughput not only improves system efficiency but also makes cloud bursting transparent to the user. This is critical for user fairness in customer-facing systems and for correctness in stream processing systems, among other settings. In this paper, we consider optimizing ordered throughput for near real-time, data-intensive, independent computations using cloud bursting. Intercloud computation of data-intensive applications is challenging because of large data transfer requirements, low intercloud bandwidth, and best-effort traffic on the Internet. The system model we consider comprises two processing stages. The first stage uses cloud bursting opportunistically for parallel processing, while the second (sequential) stage expects the output of the first stage to arrive in the same order as the arrival sequence. We propose three scheduling heuristics as part of an autonomic cloud bursting approach; they adapt to changing workload characteristics, variation in bandwidth, and available resources to optimize ordered throughput. We also characterize the operational regimes for cloud bursting as stabilization mode versus acceleration mode, depending on workload characteristics such as the size of the data to be transferred for a given compute load. This characterization helps in deciding how many instances can be optimally utilized in the external cloud.
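As an illustration of why ordering matters in the two-stage model, the following is a minimal sketch, not the paper's schedulers. It assumes that a job can enter the sequential second stage only after every earlier-arriving job has finished the first stage. The function names, the sample completion times, and the `horizon` parameter are hypothetical and chosen only for this example.

```python
from itertools import accumulate


def ordered_release_times(completion_times):
    """Return when each job can be handed to the sequential stage.

    completion_times[i] is the (assumed) time at which job i, indexed by
    arrival order, finishes the parallel first stage. Job i is released
    only once jobs 0..i have all finished, i.e. at the running maximum.
    """
    return list(accumulate(completion_times, max))


def ordered_throughput(completion_times, horizon):
    """Jobs released in arrival order per unit time up to `horizon`."""
    released = ordered_release_times(completion_times)
    return sum(1 for t in released if t <= horizon) / horizon


def unordered_throughput(completion_times, horizon):
    """Jobs finished per unit time up to `horizon`, ignoring order."""
    return sum(1 for t in completion_times if t <= horizon) / horizon


if __name__ == "__main__":
    # Hypothetical first-stage completion times; job 2 is slow, for
    # instance because it was sent to the external cloud over a slow link.
    times = [4.0, 2.0, 9.0, 5.0, 6.0]
    print(ordered_release_times(times))
    print(ordered_throughput(times, horizon=8.0))
    print(unordered_throughput(times, horizon=8.0))
```

In this sketch a single late job holds back every job that arrived after it, so ordered throughput can fall well below raw throughput. That gap is what an order-aware scheduler would try to close.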
INDEX TERMS
Cloud computing, optimization, scheduling, data-intensive, cloud bursting, ordered throughput, autonomic
CITATION
Sriram Kailasam, Nathan Gnanasambandam, Janakiram Dharanipragada, Naveen Sharma, "Optimizing Ordered Throughput Using Autonomic Cloud Bursting Schedulers", IEEE Transactions on Software Engineering, vol. 39, no. 11, pp. 1564-1581, Nov. 2013, doi:10.1109/TSE.2013.26