File-Access Characteristics of Parallel Scientific Workloads
October 1996 (vol. 7 no. 10)
pp. 1075-1089

Abstract: Phenomenal improvements in the computational performance of multiprocessors have not been matched by comparable gains in I/O system performance. This imbalance has made I/O a significant bottleneck for many scientific applications. One key to overcoming this bottleneck is improving the performance of multiprocessor file systems. The design of a high-performance multiprocessor file system requires a comprehensive understanding of the expected workload. Unfortunately, until recently, no general workload studies of multiprocessor file systems had been conducted. The goal of the CHARISMA project was to remedy this problem by characterizing the behavior of several production workloads, on different machines, at the level of individual reads and writes. The first set of results from the CHARISMA project describes the workloads observed on an Intel iPSC/860 and a Thinking Machines CM-5. This paper compares and contrasts these two workloads to understand their essential similarities and differences, isolating common trends and platform-dependent variances. Using this comparison, we are able to gain more insight into the general principles that should guide multiprocessor file-system design.

Index Terms:
Parallel file system, workload characterization, multiprocessor, parallel I/O, scientific computing.
Citation:
Nils Nieuwejaar, David Kotz, Apratim Purakayastha, Carla Schlatter Ellis, Michael L. Best, "File-Access Characteristics of Parallel Scientific Workloads," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 10, pp. 1075-1089, Oct. 1996, doi:10.1109/71.539739