A. Moffat, Dept. of Comput. Sci., Melbourne Univ., Parkville, Vic., Australia
A. Turpin, Dept. of Comput. Sci., Melbourne Univ., Parkville, Vic., Australia
Minimum-redundancy coding (also known as Huffman (1952) coding) is one of the enduring techniques of data compression. We examine how best minimum-redundancy coding can be implemented, with particular emphasis on the situation when n is large, perhaps of the order of 10/sup 6/. We review techniques for devising minimum-redundancy codes, and consider in detail how encoding and decoding should be accomplished. In particular, we describe a modified decoding method that allows improved decoding throughput, requiring just a few machine operations per output symbol (rather than for each decoded bit), and uses just a few hundred bytes of memory above and beyond the space required to store an enumeration of the source alphabet. We review methods for calculating codeword lengths, show how those codeword lengths should be used to derive a minimum-redundancy code that has the alphabetic sequence property, and describes a memory-compact method for decoding such canonical codes. An improved method for decoding canonical codes is also presented.
Index Terms:
Huffman codes; redundancy; data compression; decoding; source coding; minimum redundancy prefix codes; Huffman coding; data compression; minimum-redundancy codes; decoding; modified decoding method; decoding throughput; machine operations; memory; source alphabet; codeword lengths; alphabetic sequence property; memory-compact method; canonical codes
Citation:
A. Moffat, A. Turpin, "On the implementation of minimum-redundancy prefix codes," dcc, pp.170, Data Compression Conference (DCC '96), 1996