High-Level Buffering for Hiding Periodic Output Cost in Scientific Simulations
March 2006 (vol. 17 no. 3)
pp. 193-204
Marianne Winslett, IEEE Computer Society

Abstract—Scientific applications often need to write out large arrays and associated metadata periodically for visualization or restart purposes. In this paper, we present active buffering, a high-level transparent buffering scheme for collective I/O, in which processors actively organize their idle memory into a hierarchy of buffers for periodic output data. It utilizes idle memory on the processors, yet makes no assumption regarding runtime memory availability. Active buffering can perform background I/O while the computation is going on, is extensible to remote I/O for more efficient data migration, and can be implemented in a portable style in today's parallel I/O libraries. It can also mask performance problems of scientific data formats used by many scientists. Performance experiments with both synthetic benchmarks and real simulation codes on multiple platforms show that active buffering can greatly reduce the visible I/O cost from the application's point of view.
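The core idea in the abstract, buffering output in spare memory and writing it out in the background while computation continues, can be illustrated with a small sketch. This is not the authors' implementation: it uses a single buffer level and a Python thread in place of the buffer hierarchy and parallel I/O library the paper describes, and the names `ActiveBuffer`, `sink`, and `capacity_bytes` are hypothetical. When the assumed memory budget is exhausted, the sketch falls back to an ordinary blocking write, which reflects the abstract's point that no assumption is made about runtime memory availability.

```python
import queue
import threading


class ActiveBuffer:
    """Toy sketch: hold output in spare memory and drain it in the background."""

    def __init__(self, sink, capacity_bytes):
        self.sink = sink                # callable that performs the real (slow) write
        self.capacity = capacity_bytes  # assumed idle-memory budget (hypothetical)
        self.used = 0                   # bytes currently held in the buffer
        self.lock = threading.Lock()
        self.pending = queue.Queue()
        self.worker = threading.Thread(target=self._drain, daemon=True)
        self.worker.start()

    def write(self, data):
        """Return True if the data was buffered, False if written synchronously."""
        with self.lock:
            fits = self.used + len(data) <= self.capacity
            if fits:
                self.used += len(data)
        if fits:
            # Copy so the caller may immediately reuse its own array.
            self.pending.put(bytes(data))
            return True
        # No spare memory: wait for buffered data to reach the sink first
        # (this preserves output order), then do a plain blocking write.
        self.pending.join()
        self.sink(data)
        return False

    def _drain(self):
        """Background loop: move buffered data to the sink and free the space."""
        while True:
            data = self.pending.get()
            if data is None:            # shutdown marker placed by close()
                self.pending.task_done()
                break
            self.sink(data)
            with self.lock:
                self.used -= len(data)
            self.pending.task_done()

    def close(self):
        """Flush everything still buffered and stop the background thread."""
        self.pending.put(None)
        self.worker.join()
```

In a simulation loop, the application would call `write` at each output step and return to computation as soon as the data is copied; only writes that exceed the memory budget block the caller.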

Index Terms:
Parallel I/O library design, performance optimization, experimentation.
Citation:
Xiaosong Ma, Jonghyun Lee, Marianne Winslett, "High-Level Buffering for Hiding Periodic Output Cost in Scientific Simulations," IEEE Transactions on Parallel and Distributed Systems, vol. 17, no. 3, pp. 193-204, March 2006, doi:10.1109/TPDS.2006.36