ABSTRACT

The efficient organization of a very large file to facilitate search and retrieval operations is an important but very complex problem. In this paper we consider the case of a large file in which the frequencies of use of its component subfiles are known. We develop an organization of the file that minimizes the average number of entries examined to locate an individual item by binary search. The algorithm iteratively partitions the file into "saturated" subfiles, and each successive iteration reduces the average number of entries examined to locate an item, until no further improvement is possible. Next, we extend the method to the realistic problem of designing an optimal memory hierarchy to hold the file in a computer system. The sizes of the various memory components and the locations of the items of the frequency-dependent file are determined so that the average time to locate an item (over the totality of items) in the memory hierarchy is minimized for a given total cost of the memory system. A number of examples are given to elucidate the methods. The characteristics and results of a Fortran implementation of the algorithms on the CDC 6600 are also described.
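To make the quantity being minimized concrete, the following is a minimal sketch (not the paper's algorithm) of how the frequency-weighted average number of entries examined might be computed for a single sorted file and for a file split into subfiles that are searched one after another. The function names, the sample frequencies, and the cost model for an unsuccessful subfile search (`m.bit_length()` probes for a subfile of `m` items, which is exact when `m` has the form 2^k - 1 and a worst case otherwise) are all assumptions made for illustration.

```python
def probes_for_position(n, pos):
    """Count the entries examined by binary search to find index `pos`
    in a sorted file of `n` items."""
    lo, hi, count = 0, n - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        count += 1
        if mid == pos:
            return count
        if mid < pos:
            lo = mid + 1
        else:
            hi = mid - 1
    raise ValueError("pos is outside the file")


def average_probes(freqs):
    """Frequency-weighted mean probe count for one sorted file.
    freqs[i] is the access frequency of the i-th item in key order."""
    n = len(freqs)
    weighted = sum(f * probes_for_position(n, i) for i, f in enumerate(freqs))
    return weighted / sum(freqs)


def average_probes_partitioned(subfiles):
    """Mean probe count when subfiles are searched in list order.
    An item pays for a failed search in every earlier subfile
    (assumed cost: m.bit_length() probes for a subfile of m items)
    plus its own successful search."""
    total = sum(sum(sf) for sf in subfiles)
    weighted = 0
    miss_cost = 0
    for sf in subfiles:
        m = len(sf)
        for i, f in enumerate(sf):
            weighted += f * (miss_cost + probes_for_position(m, i))
        miss_cost += m.bit_length()
    return weighted / total


# Hypothetical file: seven rarely used items and one heavily used item
# whose key happens to sort last.
single = average_probes([1, 1, 1, 1, 1, 1, 1, 50])
# Same items, with the hot item moved into its own subfile searched first.
split = average_probes_partitioned([[50], [1, 1, 1, 1, 1, 1, 1]])
print(single, split)  # roughly 3.81 versus 1.30 probes on average
```

In this toy case, isolating the frequently used item lowers the average because the cheap search of the first subfile succeeds for most requests; the paper's method chooses such partitions systematically.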

INDEX TERMS

Access time, binary search, cost of memory type, file, frequency of usage, frequency partition file, item, key, mean frequency, memory hierarchy, saturated file.

CITATION

C. Ramamoorthy and Yeh-Hao Chin, "An Efficient Organization of Large Frequency-Dependent Files for Binary Searching," in *IEEE Transactions on Computers*, vol. 20, pp. 1178-1187, 1971.

doi:10.1109/T-C.1971.223102
