Parallel Quicksort Using Fetch-And-Add
January 1990 (vol. 39 no. 1)
pp. 133-138

A parallelization of the Quicksort algorithm that is suitable for execution on a shared memory multiprocessor with an efficient implementation of the fetch-and-add operation is presented. The partitioning phase of Quicksort, which has been considered a serial bottleneck, is cooperatively executed in parallel by many processors through the use of fetch-and-add. The parallel algorithm maintains the in-place nature of Quicksort, thereby allowing internal sorting of large arrays. A class of fetch-and-add-based algorithms for dynamically scheduling processors to subproblems is presented. Adaptive scheduling algorithms in this class have low overhead and achieve effective processor load balancing. The basic algorithm is shown to execute in an average of O(log(N)) time on an N-processor PRAM (parallel random-access machine) assuming a constant-time fetch-and-add. Estimated speedups, based on simulations, are also presented for cases when the number of items to be sorted is much greater than the number of processors.
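The cooperative partitioning idea can be illustrated with a small sketch. This is not the paper's in-place algorithm: it is a simplified variant, written under stated assumptions, that copies into an auxiliary buffer so that the role of fetch-and-add is easy to see. The names `FetchAndAdd` and `parallel_partition` are hypothetical, and a lock stands in for the hardware fetch-and-add primitive that the abstract assumes. Each worker claims the next unread input index with one fetch-and-add, then claims an output slot from either the left or the right end with another.

```python
import threading


class FetchAndAdd:
    """Shared integer; a lock simulates a hardware fetch-and-add (assumption)."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def fetch_add(self, delta):
        # Atomically return the old value and add delta to the counter.
        with self._lock:
            old = self._value
            self._value += delta
            return old


def parallel_partition(items, pivot, num_workers=4):
    """Partition items around pivot using several cooperating workers.

    Returns (out, split): out[:split] holds elements < pivot and
    out[split:] holds elements >= pivot. Not in-place, unlike the paper.
    """
    n = len(items)
    out = [None] * n
    next_in = FetchAndAdd(0)        # next unread input index
    next_low = FetchAndAdd(0)       # next free slot, growing from the left
    next_high = FetchAndAdd(n - 1)  # next free slot, growing from the right

    def worker():
        while True:
            i = next_in.fetch_add(1)  # claim one input element
            if i >= n:
                return
            x = items[i]
            if x < pivot:
                out[next_low.fetch_add(1)] = x
            else:
                out[next_high.fetch_add(-1)] = x

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # After all workers finish, next_low equals the number of small elements.
    return out, next_low.fetch_add(0)
```

Because every input index and every output slot is handed out by a distinct fetch-and-add, no two workers ever read the same element or write the same slot, so no further locking of the arrays is needed. The order of elements within each side depends on thread timing and is not deterministic.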

Index Terms:
fetch-and-add; parallelization; Quicksort algorithm; shared memory multiprocessor; partitioning phase; parallel algorithm; sorting; scheduling; N-processor PRAM; simulations; parallel algorithms.
P. Heidelberger, A. Norton, J.T. Robinson, "Parallel Quicksort Using Fetch-And-Add," IEEE Transactions on Computers, vol. 39, no. 1, pp. 133-138, Jan. 1990, doi:10.1109/12.46289