2008 3rd Petascale Data Storage Workshop. (PDSW '08) (2008)
Austin, TX
Nov. 17, 2008
ISSN: 2157-7242
ISBN: 978-1-4244-4208-9
pp: 1-5
Paul Nowoczynski , Pittsburgh Supercomputing Center, PA USA
Nathan Stone , Pittsburgh Supercomputing Center, PA USA
Jared Yanovich , Pittsburgh Supercomputing Center, PA USA
Jason Sommerfield , Pittsburgh Supercomputing Center, PA USA
ABSTRACT
The PSC has developed a prototype distributed file system infrastructure that vastly accelerates aggregate write bandwidth on large compute platforms. Write bandwidth, more than read bandwidth, is the dominant bottleneck in HPC I/O scenarios because applications write checkpoint data, visualization data and post-processing (multi-stage) data. We have prototyped a scalable solution that will be directly applicable to future petascale compute platforms with on the order of 10^6 cores. Our design emphasizes high-efficiency scalability, low-cost commodity components, lightweight software layers, end-to-end parallelism, client-side caching and software parity, and a unique model of load-balancing outgoing I/O onto high-speed intermediate storage followed by asynchronous reconstruction to a 3rd-party parallel file system.
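The abstract mentions client-side software parity but does not say which parity scheme Zest uses. As a rough illustration only, the sketch below assumes plain single-parity XOR over equal-sized blocks; the function names xor_parity and rebuild_missing are hypothetical and are not taken from the paper.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of a list of equal-length byte blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("blocks must have equal length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover one lost block from the surviving blocks plus the parity block.

    Assumption: exactly one data block is missing (single-parity scheme).
    """
    return xor_parity(list(surviving_blocks) + [parity])


if __name__ == "__main__":
    data = [b"abcd", b"efgh", b"ijkl"]   # blocks buffered on the client
    parity = xor_parity(data)            # computed before the data is sent
    recovered = rebuild_missing([data[0], data[2]], parity)
    print(recovered == data[1])
```

Computing parity on the client, as in this sketch, would keep that work off the storage servers, which is in keeping with the lightweight software layers the abstract describes.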
INDEX TERMS
checkpointing, data visualisation, input-output programs, mainframes, parallel processing, program verification, resource allocation
CITATION

P. Nowoczynski, N. Stone, J. Yanovich and J. Sommerfield, "Zest Checkpoint storage system for large supercomputers," 2008 3rd Petascale Data Storage Workshop (PDSW '08), Austin, TX, 2009, pp. 1-5.
doi:10.1109/PDSW.2008.4811883