Realistic Workload Modeling and Its Performance Impacts in Large-Scale eScience Grids
PrePrint
ISSN: 1045-9219
Hui Li, SAP Research, Karlsruhe
Grid computing has proved to be a successful paradigm for large-scale distributed data processing, and global eScience Grids have been in production for years (e.g., LCG and OSG). The majority of applications running in these production environments can be characterized as massive CPU-intensive batch jobs (or "bags of tasks"), sometimes considered the "killer" application for the Grid. A deep understanding of their main workload characteristics is not only necessary for realistic performance evaluation of existing systems, but also crucial for generating new insights into better resource allocation schemes. This paper presents a comprehensive statistical analysis of the workloads on production eScience Grid environments. We focus on second-order statistics and the scaling behavior of the main job characteristics, namely job arrivals and job run times. A range of autocorrelation structures is identified and analyzed, including pseudo-periodicity, short-range dependence (SRD), and long-range dependence (LRD). We further develop mathematical models that capture these salient properties of the workloads. The workload models, in turn, enable us to quantitatively evaluate the performance impacts of autocorrelations in Grid scheduling. The results indicate that autocorrelations in workloads degrade system performance, in some cases by several orders of magnitude. Nevertheless, better performance can be achieved at the Grid level under bursty local background workloads. Such effects of workloads on systems are extensively analyzed and explained.
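As an illustration of the second-order statistics mentioned in the abstract, the following is a minimal sketch of a sample autocorrelation function applied to a job-arrival series. It is not taken from the paper: the function name `sample_autocorrelation`, the `arrivals` data, and the period of four intervals are all hypothetical and chosen only to show how pseudo-periodicity appears as a peak at the matching lag.

```python
def sample_autocorrelation(series, max_lag):
    """Return the sample autocorrelation of `series` at lags 0..max_lag."""
    n = len(series)
    if n < 2 or max_lag >= n:
        raise ValueError("need at least two points and max_lag < len(series)")
    mean = sum(series) / n
    centered = [x - mean for x in series]
    # Sum of squared deviations; normalizes the lag-0 value to 1.
    total = sum(c * c for c in centered)
    if total == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    return [
        sum(centered[t] * centered[t + lag] for t in range(n - lag)) / total
        for lag in range(max_lag + 1)
    ]


# Hypothetical job-arrival counts per interval, repeating every 4 intervals.
arrivals = [10, 2, 1, 2] * 6
acf = sample_autocorrelation(arrivals, 8)

for lag, value in enumerate(acf):
    print(f"lag {lag}: {value:+.3f}")
```

In this toy series the autocorrelation is strongly positive at lag 4 (the assumed period) and negative at lag 2. A slowly decaying autocorrelation over many lags, rather than a periodic one, would instead hint at the long-range dependence the abstract refers to.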
Index Terms:
Modeling techniques, Measurement, evaluation, modeling, simulation of multiple-processor systems
Citation:
Hui Li, "Realistic Workload Modeling and Its Performance Impacts in Large-Scale eScience Grids," IEEE Transactions on Parallel and Distributed Systems, 15 Jun. 2009. IEEE Computer Society Digital Library. IEEE Computer Society, <http://doi.ieeecomputersociety.org/10.1109/TPDS.2009.99>