2008 IEEE International Conference on Cluster Computing (2008)
Sept. 29, 2008 to Oct. 1, 2008
Dan Feng, Sch. of Comput., Huazhong Univ. of Sci. & Technol., Wuhan
Qiang Zou, Sch. of Comput., Huazhong Univ. of Sci. & Technol., Wuhan
Hong Jiang, Univ. of Nebraska-Lincoln, Lincoln, NE
Yifeng Zhu, Univ. of Maine, Orono, ME
One of the challenging issues in evaluating the performance of parallel storage systems through synthetic-trace-driven simulation is accurately characterizing the I/O demands of data-intensive scientific applications. This paper analyzes several I/O traces collected from different distributed systems and finds that correlations in parallel I/O inter-arrival times are inconsistent across workloads: some traces show little correlation, while others show evident and abundant correlations. Conventional Poisson or Markov arrival processes are therefore inappropriate for modeling I/O arrivals in some applications. Instead, a new, generic model based on the alpha-stable process is proposed and validated to model parallel I/O burstiness accurately in workloads with either little or strong correlation. This model can be used to generate reliable synthetic I/O sequences for simulation studies. Experimental results show that the model captures the complex I/O behavior of real storage systems more accurately and faithfully than conventional models, particularly the burstiness of parallel I/O workloads.
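As a rough illustration of how heavy-tailed synthetic arrivals might be generated, the minimal sketch below draws alpha-stable variates with the Chambers-Mallows-Stuck method and accumulates them into arrival timestamps. It is not the authors' model or fitting procedure: the function names, the default parameters (alpha = 0.8, beta = 1, scale = 1) and the choice to use stable variates directly as inter-arrival gaps are assumptions made for this example. With alpha < 1 and beta = 1 the variates are positive, which is why those defaults are used here.

```python
import math
import random


def _open_unit(rng):
    """Uniform draw on the open interval (0, 1)."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


def alpha_stable_sample(alpha, beta, rng):
    """One standard alpha-stable variate via Chambers-Mallows-Stuck (alpha != 1)."""
    if not (0 < alpha < 2) or alpha == 1 or not (-1 <= beta <= 1):
        raise ValueError("sketch supports 0 < alpha < 2, alpha != 1, |beta| <= 1")
    v = math.pi * (_open_unit(rng) - 0.5)   # uniform angle in (-pi/2, pi/2)
    w = -math.log(_open_unit(rng))          # exponential variate with mean 1
    t = beta * math.tan(math.pi * alpha / 2)
    b = math.atan(t) / alpha                # skewness-dependent angle shift
    s = (1 + t * t) ** (1 / (2 * alpha))    # skewness-dependent scale factor
    return (s * math.sin(alpha * (v + b)) / math.cos(v) ** (1 / alpha)
            * (math.cos(v - alpha * (v + b)) / w) ** ((1 - alpha) / alpha))


def synthetic_arrival_times(n, alpha=0.8, beta=1.0, scale=1.0, seed=0):
    """Accumulate n stable-distributed gaps into arrival timestamps.

    Assumption: alpha < 1 and beta = 1, so every gap is positive.
    Other parameter choices can yield negative gaps and need a different mapping.
    """
    rng = random.Random(seed)
    now, arrivals = 0.0, []
    for _ in range(n):
        now += scale * alpha_stable_sample(alpha, beta, rng)
        arrivals.append(now)
    return arrivals


if __name__ == "__main__":
    for ts in synthetic_arrival_times(5):
        print(f"{ts:.3f}")
```

Occasional very large gaps followed by runs of short ones are what give such a sequence its bursty character; in a real study the parameters would be estimated from measured traces rather than fixed by hand.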
parallel processing, distributed memory systems
Dan Feng, Qiang Zou, Hong Jiang and Yifeng Zhu, "A novel model for synthesizing parallel I/O workloads in scientific applications," 2008 IEEE International Conference on Cluster Computing (CLUSTER), Tsukuba, Japan, 2008, pp. 252-261.