Issue No.04 - April (2010 vol.21)
pp: 480-493
Hui Li, SAP AG-SAP Research, Karlsruhe
Grid computing has proved to be a successful paradigm for large-scale distributed data processing, and global eScience Grids (e.g., LCG and OSG) have been in production for years. The majority of applications running in these production environments can be characterized as massive CPU-intensive batch jobs (or “bag-of-tasks”), sometimes considered the “killer” application for the Grid. A deep understanding of the main characteristics of these workloads is not only necessary for realistic performance evaluation of existing systems, but also crucial for generating new insights into better resource allocation schemes. This paper presents a comprehensive statistical analysis of the workloads on production eScience Grid environments. We focus on second-order statistics and the scaling behavior of the main job characteristics, namely job arrivals and job runtimes. A range of autocorrelation structures is identified and analyzed, including pseudoperiodicity, short-range dependence (SRD), and long-range dependence (LRD). We further develop mathematical models that are able to capture these salient properties of the workloads. Workload models, in turn, enable us to quantitatively evaluate the performance impact of autocorrelations in Grid scheduling. The results indicate that autocorrelations in workloads degrade system performance, sometimes by several orders of magnitude. Nevertheless, better performance can be achieved at the Grid level under bursty local background workloads. Such effects of workloads on systems are extensively analyzed and explained.
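As a rough illustration of the second-order statistics mentioned in the abstract, the sketch below computes a standard sample autocorrelation function for a series of per-interval job counts. It is not the paper's method or model; the function name `sample_autocorrelation` and both example series are hypothetical and chosen only to show how bursty arrivals produce positive correlation at short lags.

```python
from typing import List, Sequence


def sample_autocorrelation(series: Sequence[float], max_lag: int) -> List[float]:
    """Return the sample autocorrelation r(0), ..., r(max_lag) of a series.

    Uses the common estimator that divides the lagged sum of centered
    products by the total sum of squared deviations.
    """
    n = len(series)
    if n < 2 or not 0 <= max_lag < n:
        raise ValueError("need at least 2 points and 0 <= max_lag < len(series)")
    mean = sum(series) / n
    centered = [x - mean for x in series]
    squared_sum = sum(c * c for c in centered)
    if squared_sum == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum(centered[t] * centered[t + lag] for t in range(n - lag)) / squared_sum
        for lag in range(max_lag + 1)
    ]


# Hypothetical job counts per time bin: bursts of arrivals, then quiet periods.
bursty_counts = [9, 8, 9, 1, 0, 1, 9, 9, 8, 0, 1, 0]
# The same values rearranged so that busy and quiet bins alternate.
alternating_counts = [9, 0, 8, 1, 9, 0, 9, 1, 8, 0, 9, 1]

acf_bursty = sample_autocorrelation(bursty_counts, 3)
acf_alternating = sample_autocorrelation(alternating_counts, 3)

print("bursty lag-1 autocorrelation:     ", round(acf_bursty[1], 3))
print("alternating lag-1 autocorrelation:", round(acf_alternating[1], 3))
```

Both series contain the same values, so their marginal distributions are identical; only the ordering, and therefore the autocorrelation, differs. That is the kind of difference a model fitted to first-order statistics alone would miss.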
Workload modeling, burstiness, autocorrelation in workloads, performance evaluation, Grid computing.
Hui Li, "Realistic Workload Modeling and Its Performance Impacts in Large-Scale eScience Grids", IEEE Transactions on Parallel & Distributed Systems, vol.21, no. 4, pp. 480-493, April 2010, doi:10.1109/TPDS.2009.99