2012 IEEE 20th International Symposium on Modelling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS) (2012)
Aug. 7, 2012 to Aug. 9, 2012
Current I/O techniques have pushed the write performance close to the system peak, but they usually overlook the read side of problem. With the mounting needs of scientific discovery, it is important to provide good read performance for many common access patterns. Such demand requires an organization scheme that can effectively utilize the underlying storage system. However, the mismatch between conventional data layout on disk and common scientific access patterns leads to significant performance degradation when a subset of data is accessed. To this end, we design a system-aware Optimized Chunking model, which aims to find an optimized organization that can strike for a good balance between data transfer efficiency and processing overhead. To enable such model for scientific applications, we propose SMART-IO, a two-level data organization framework that can organize the blocks of multidimensional data efficiently. This scheme can adapt data layouts based on data characteristics and underlying storage systems, and enable efficient scientific analytics. Our experimental results demonstrate that SMART-IO can significantly improve the read performance for challenging access patterns, and speed up data analytics. For a mission critical combustion simulation code S3D, Smart-IO achieves up to 72 times speedup for planar reads of a 3-D variable compared to the logically contiguous data layout.
combustion, data analysis, middleware, natural sciences computing, storage management
Y. Tian et al., "SMART-IO: SysteM-AwaRe Two-Level Data Organization for Efficient Scientific Analytics," 2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems(MASCOTS), Washington, DC USA, 2012, pp. 181-188.