Using Tries to Eliminate Pattern Collisions in Perfect Hashing
April 1994 (vol. 6 no. 2)
pp. 239-247

Many current perfect hashing algorithms suffer from pattern collisions. This paper introduces a perfect hashing technique that uses array-based tries and a simple sparse matrix packing algorithm. The technique eliminates all pattern collisions, so it can be used to form ordered minimal perfect hashing functions on extremely large word lists. For large word lists, the algorithm is superior to other known perfect hashing functions in function-building efficiency, pattern collision avoidance, and retrieval function complexity. It has been used successfully to form an ordered minimal perfect hashing function for the entire 24,481-element Unix word list without resorting to segmentation. The item lists addressed by the resulting perfect hashing function can be ordered in any manner, including alphabetically, which makes it easy to provide other forms of access to the same list.
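
The general idea of an array-based trie packed into a one-dimensional table can be illustrated with a small sketch. This is not the paper's algorithm: it assumes lowercase a-z words, stores a full trie rather than any abbreviated form, and uses plain first-fit row displacement as a stand-in for the "simple sparse matrix packing algorithm" the abstract mentions. The names build_rows, pack, and lookup are hypothetical. Each trie node is a sparse row indexed by letter code; rows are overlaid in one array so that no two occupied cells coincide, and a check array records which row owns each cell. The value stored under the end-of-word code is the word's position in the sorted list, which gives an ordered minimal mapping onto 0..n-1.

```python
END = 0  # code of the end-of-word marker (an assumption of this sketch)


def code(ch):
    """Map 'a'..'z' to column codes 1..26."""
    return ord(ch) - ord("a") + 1


def build_rows(words):
    """Build the trie as sparse rows: rows[node][code] -> child node,
    and rows[node][END] -> index of the word that ends at that node."""
    rows = [{}]
    for index, word in enumerate(words):
        node = 0
        for ch in word:
            c = code(ch)
            if c not in rows[node]:
                rows.append({})
                rows[node][c] = len(rows) - 1
            node = rows[node][c]
        rows[node][END] = index
    return rows


def pack(rows):
    """First-fit packing: slide each row to the smallest offset at which
    none of its occupied columns lands on an already used cell."""
    base = [0] * len(rows)
    check, value = [], []  # check[pos] = owning row, or -1 if the cell is free
    # Placing dense rows first is a common heuristic, assumed here.
    for r in sorted(range(len(rows)), key=lambda r: -len(rows[r])):
        b = 0
        while any(b + c < len(check) and check[b + c] != -1 for c in rows[r]):
            b += 1
        base[r] = b
        top = b + max(rows[r], default=0) + 1
        if len(check) < top:
            grow = top - len(check)
            check.extend([-1] * grow)
            value.extend([-1] * grow)
        for c, v in rows[r].items():
            check[b + c] = r
            value[b + c] = v
    return base, check, value


def lookup(word, base, check, value):
    """Return the word's index in the original list, or None if absent."""
    node = 0
    for ch in word:
        if not "a" <= ch <= "z":
            return None
        pos = base[node] + code(ch)
        if pos >= len(check) or check[pos] != node:
            return None
        node = value[pos]
    pos = base[node] + END
    if pos >= len(check) or check[pos] != node:
        return None
    return value[pos]


words = sorted(["apple", "apply", "ape", "bat", "bath", "cat"])
base, check, value = pack(build_rows(words))
indices = [lookup(w, base, check, value) for w in words]
```

Because every cell of the packed table is owned by exactly one trie row, two different words can never be sent to the same slot in this sketch, and a word outside the list fails a check comparison instead of returning a false match. Sorting the input alphabetically makes the returned indices follow alphabetical order.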

Index Terms:
file organisation; Unix; list processing; computational complexity; sparse matrix packing algorithm; retrieval function complexity; perfect hashing algorithms; array-based tries; Unix word list; large word lists; function building efficiency; pattern collision avoidance; ordered minimal perfect hashing function; item lists; ordering; sparse array packing
M.D. Brain, A.L. Tharp, "Using Tries to Eliminate Pattern Collisions in Perfect Hashing," IEEE Transactions on Knowledge and Data Engineering, vol. 6, no. 2, pp. 239-247, April 1994, doi:10.1109/69.277768