Directory Reference Patterns in Hierarchical File Systems
June 1989 (vol. 1 no. 2)
pp. 238-247

The authors briefly describe data on directory reference patterns collected from a 4.2BSD UNIX system. These data are used to examine the importance of the name lookup overhead involved in opening and using files. The analysis shows that paths in the environment are relatively long and that, in the absence of caching, name resolution overhead accounts for over 70% of the disk blocks referenced to open and use files. These results confirm recent conjectures on the high level of directory activity in UNIX file systems. Directory references exhibit strong locality, however, which makes caching an effective way to decrease directory overhead. Simulations of a least recently used (LRU) whole-directory cache show that a cache holding just ten nodes achieves an 85% hit ratio. The implications of these results for the design of both local and distributed file systems are discussed.
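
The kind of simulation the abstract mentions can be illustrated with a minimal sketch. It is not the authors' simulator: the trace format (a list of absolute path names), the helper names directories_on_path and lru_hit_ratio, and the choice to count one reference per directory on each path are all assumptions made for illustration. The sketch replays a sequence of opened paths against an LRU cache of whole directories and reports the fraction of directory references found in the cache.

```python
from collections import OrderedDict


def directories_on_path(path):
    """Return every directory visited while resolving an absolute path.

    Hypothetical helper: the final component is treated as the file
    being opened, so only the directories leading to it are returned.
    """
    parts = [p for p in path.split("/") if p]
    dirs = ["/"]
    for name in parts[:-1]:
        dirs.append(dirs[-1].rstrip("/") + "/" + name)
    return dirs


def lru_hit_ratio(paths, capacity):
    """Replay a trace of opened paths against an LRU whole-directory cache.

    Assumption: each directory on a path counts as one reference, and a
    cached directory satisfies the lookup without any disk access.
    """
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = refs = 0
    for path in paths:
        for d in directories_on_path(path):
            refs += 1
            if d in cache:
                hits += 1
                cache.move_to_end(d)           # mark as most recently used
            else:
                cache[d] = True
                if len(cache) > capacity:
                    cache.popitem(last=False)  # evict least recently used
    return hits / refs if refs else 0.0
```

Calling lru_hit_ratio(trace, 10) on a list of traced path names would model a ten-directory cache. The ratio obtained depends entirely on the trace supplied, so this sketch by itself does not reproduce the 85% figure reported in the abstract.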

Index Terms:
file opening; file use; local file systems; hierarchical file systems; directory reference patterns; 4.2BSD UNIX system; name lookup overhead; paths; environment; name resolution overhead; disk blocks; locality; caches; least recently used; LRU; nodes; distributed file systems; data handling; distributed databases; file organisation
R.A. Floyd, C. Schlatter Ellis, "Directory Reference Patterns in Hierarchical File Systems," IEEE Transactions on Knowledge and Data Engineering, vol. 1, no. 2, pp. 238-247, June 1989, doi:10.1109/69.87963