2018 IEEE 38th International Conference on Distributed Computing Systems (ICDCS) (2018)
Jul 2, 2018 to Jul 6, 2018
Cloud developers traditionally rely on purpose-specific services to provide the storage model they need for an application. In contrast, HPC developers have a much more limited choice, typically restricted to a centralized parallel file system for persistent storage. Unfortunately, these systems often offer low performance when subjected to highly concurrent, conflicting I/O patterns. This makes it difficult to implement inherently concurrent data structures such as distributed shared logs. Yet this data structure is key to applications such as computational steering, data collection from physical sensor grids, or discrete event generators. In this paper we tackle this issue. We present SLoG, a shared log middleware that provides a shared log abstraction over a parallel file system and is designed to circumvent the aforementioned limitations. We evaluate SLoG's design on up to 100,000 cores of the Theta supercomputer: the results show that SLoG achieves high append velocity at scale while also providing substantial benefits for other persistent backend storage systems.
Big Data, cloud computing, data structures, middleware, parallel processing
P. Matri, P. Carns, R. Ross, A. Costan, M. S. Perez and G. Antoniu, "SLoG: Large-Scale Logging Middleware for HPC and Big Data Convergence," 2018 IEEE 38th International Conference on Distributed Computing Systems (ICDCS), Vienna, Austria, 2018, pp. 1507-1512.