2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (2018)
Vancouver, British Columbia, Canada
May 21, 2018 to May 25, 2018
Parallel file systems (PFSs) are widely deployed to improve the I/O performance of high-performance computing (HPC) applications. In recent years, hybrid PFSs that consist of HDD and SSD servers have attracted much attention in the HPC community. However, existing data layout schemes do not adequately account for the characteristics of heterogeneous servers and heterogeneous access patterns, and may therefore suffer considerable inefficiencies. In this study, we propose MHA, a migratory heterogeneity-aware data layout scheme that improves the data distribution of hybrid PFSs. Specifically, to accommodate heterogeneous access patterns, MHA first migrates file data into several regions, each with similar access patterns. Then, using a data access cost model, MHA determines the stripe sizes on heterogeneous servers that give the best performance for each region. We have implemented MHA within the MPI-IO library on top of the OrangeFS file system. Experimental results show that MHA can significantly improve the I/O performance of hybrid PFSs compared to existing data layout schemes.
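To make the stripe-size step concrete, the following is a minimal sketch of cost-model-driven stripe selection for one region. It is not the cost model from the paper: the latency-plus-transfer cost, the device parameters, the candidate grid, and all function names (`bytes_per_server`, `request_cost`, `choose_stripes`) are assumptions made for illustration only.

```python
from itertools import product

# Illustrative device parameters (assumed, not measured): startup latency in
# seconds and sustained bandwidth in bytes per second.
HDD = {"latency": 5e-3, "bandwidth": 100e6}
SSD = {"latency": 1e-4, "bandwidth": 400e6}


def bytes_per_server(request_size, stripes):
    """Split a request that starts at offset 0 round-robin across servers.

    stripes[i] is the stripe size of server i; a size of 0 skips that server.
    """
    cycle = sum(stripes)
    if cycle <= 0:
        raise ValueError("at least one server needs a non-zero stripe size")
    full_cycles, remainder = divmod(request_size, cycle)
    loads = []
    for width in stripes:
        extra = min(width, remainder)  # share of the final, partial cycle
        remainder -= extra
        loads.append(full_cycles * width + extra)
    return loads


def request_cost(request_size, stripes, devices):
    """Toy cost: servers work in parallel, so the slowest one sets the time."""
    loads = bytes_per_server(request_size, stripes)
    return max(
        (dev["latency"] + load / dev["bandwidth"]
         for load, dev in zip(loads, devices) if load > 0),
        default=0.0,
    )


def choose_stripes(request_size, n_hdd, n_ssd, candidates):
    """Exhaustively search candidate (HDD stripe, SSD stripe) pairs."""
    devices = [HDD] * n_hdd + [SSD] * n_ssd
    best = None
    for h, s in product(candidates, repeat=2):
        stripes = [h] * n_hdd + [s] * n_ssd
        if sum(stripes) == 0:
            continue  # no server would hold any data
        cost = request_cost(request_size, stripes, devices)
        if best is None or cost < best[0]:
            best = (cost, h, s)
    return best  # (estimated cost, HDD stripe size, SSD stripe size)
```

Under these assumptions, a layout tool would call `choose_stripes` once per region with that region's typical request size, so that regions dominated by small requests and regions dominated by large requests can end up with different HDD and SSD stripe sizes.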
application program interfaces, file organisation, file servers, message passing, parallel processing, storage management
S. He, X. Sun, Y. Wang and C. Xu, "A Migratory Heterogeneity-Aware Data Layout Scheme for Parallel File Systems," 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS), Vancouver, British Columbia, Canada, 2018, pp. 1133-1142.