Improving Quicksort Performance with a Codeword Data Structure
May 1989 (vol. 15 no. 5)
pp. 622-631

This paper discusses how a new data structure, the codeword structure, can improve the performance of quicksort when the records to be sorted are long and the keys are alphanumeric sequences of bytes. A codeword is a compact representation of a key with respect to some codeword generator. It consists of a byte holding the count of equal bytes, a byte holding the first nonequal byte, and a pointer to the record. It is shown how an appropriate choice of the codeword generator preserves the ordering of keys and how this can be applied to the quicksort algorithm. An analysis of the potential savings on various architectures, together with actual measurements, shows the improvements that can be attained by using codewords rather than pointers. The factors considered are architecture-independent parameters, such as the number of bytes to be compared and the number of swaps; architecture-dependent parameters, such as caches and their write policies; and compiler optimizations, such as in-line expansion and register allocation.
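
The codeword layout described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it makes several assumptions the abstract does not state: keys have the same fixed length as the generator, "equal bytes" means the leading bytes shared with the generator, all keys being sorted are greater than or equal to the generator, and a list index stands in for the record pointer. The names `Codeword`, `make_codeword`, and `sort_above` are hypothetical, and Python's built-in sort is used in place of quicksort for brevity.

```python
from functools import cmp_to_key
from typing import NamedTuple


class Codeword(NamedTuple):
    equal_count: int  # leading bytes shared with the generator (assumed meaning)
    first_diff: int   # first byte that differs from the generator, 0 if none
    index: int        # stands in for the pointer to the record


def make_codeword(key: bytes, generator: bytes, index: int) -> Codeword:
    """Encode key relative to generator; assumes len(key) == len(generator)."""
    n = 0
    while n < len(key) and key[n] == generator[n]:
        n += 1
    return Codeword(n, key[n] if n < len(key) else 0, index)


def sort_above(keys: list[bytes], generator: bytes) -> list[bytes]:
    """Sort keys that are all >= generator, consulting codewords first."""
    codewords = [make_codeword(k, generator, i) for i, k in enumerate(keys)]

    def compare(a: Codeword, b: Codeword) -> int:
        # A longer shared prefix means the key is closer to the generator,
        # and therefore smaller among keys that lie above it.
        if a.equal_count != b.equal_count:
            return b.equal_count - a.equal_count
        # Same prefix length: the first differing byte decides.
        if a.first_diff != b.first_diff:
            return a.first_diff - b.first_diff
        # Codewords tie: fall back to the remaining bytes of the full keys.
        start = a.equal_count + 1
        rest_a, rest_b = keys[a.index][start:], keys[b.index][start:]
        return (rest_a > rest_b) - (rest_a < rest_b)

    ordered = sorted(codewords, key=cmp_to_key(compare))
    return [keys[c.index] for c in ordered]
```

In this sketch most comparisons are resolved from the two codeword bytes alone, and the full keys are read only when both bytes tie. That corresponds to the saving in bytes compared that the abstract refers to.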


Index Terms:
performance improvement; codeword data structure; records; long; keys; alphanumeric sequences; bytes; codeword generator; character count; first nonequal byte; pointer; ordering; quicksort algorithm; swaps; architecture-dependent parameters; caches; write policies; compiler optimizations; in-line expansion; register allocation; data structures; sorting
J.-L. Baer, Y.-B. Lin, "Improving Quicksort Performance with a Codeword Data Structure," IEEE Transactions on Software Engineering, vol. 15, no. 5, pp. 622-631, May 1989, doi:10.1109/32.24711