D.M. Nicol, D.R. O'Hallaron, "Improved Algorithms for Mapping Pipelined and Parallel Computations," IEEE Transactions on Computers, vol. 40, no. 3, pp. 295-306, March 1991.
DOI: 10.1109/12.76406. ISSN: 0018-9340. Publisher: IEEE Computer Society, Los Alamitos, CA, USA.
Keywords: pipelined computations; linear array systems; shared memory systems; time complexities; parallel computations; host-satellite systems; computation module execution times; intermodule communication times; homogeneity constraints; space complexities; parallel mapping algorithms; computational complexity; parallel algorithms.
Recent work on the problem of mapping pipelined or parallel computations onto linear array, shared memory, and host-satellite systems is extended. It is shown how these problems can be solved even more efficiently when computation module execution times are bounded from below, intermodule communication times are bounded from above, and the processors satisfy certain homogeneity constraints. The improved algorithms have significantly lower time and space complexities than the more general algorithms: in one case, an $O(nm^3)$ time algorithm for mapping $m$ modules onto $n$ processors is replaced with an $O(nm \log m)$ time algorithm, and the space requirements are reduced from $O(nm^2)$ to $O(m)$. Run-time complexity is reduced further with parallel mapping algorithms based on these improvements, which run on the architectures for which they create mappings.
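To make the mapping problem concrete, the following is a minimal sketch of one simplified variant: a chain of m modules is cut into at most n contiguous blocks, one per processor, so that the most heavily loaded processor is as lightly loaded as possible. It is a textbook probe-and-bisect approach, not the algorithm of the paper. The assumptions are mine: processors are identical, execution times are positive integers, and intermodule communication times are ignored. The names `fits`, `min_bottleneck`, and `assign` are hypothetical.

```python
from typing import List, Sequence, Tuple


def fits(weights: Sequence[int], n: int, bound: int) -> bool:
    """Greedy probe: can the chain be cut into at most n contiguous
    blocks whose total weights are all <= bound?"""
    blocks, load = 1, 0
    for w in weights:
        if w > bound:
            return False          # a single module already exceeds the bound
        if load + w > bound:
            blocks += 1           # start a new block on the next processor
            load = w
        else:
            load += w
    return blocks <= n


def min_bottleneck(weights: Sequence[int], n: int) -> int:
    """Smallest achievable maximum block weight, found by bisection
    over integer bounds (assumes positive integer weights)."""
    if not weights or n < 1:
        raise ValueError("need at least one module and one processor")
    lo, hi = max(weights), sum(weights)
    while lo < hi:
        mid = (lo + hi) // 2
        if fits(weights, n, mid):
            hi = mid              # mid is feasible; try a tighter bound
        else:
            lo = mid + 1          # mid is infeasible; relax the bound
    return lo


def assign(weights: Sequence[int], n: int) -> Tuple[int, List[List[int]]]:
    """Return the optimal bottleneck and one mapping that attains it,
    as lists of module indices per processor."""
    bound = min_bottleneck(weights, n)
    blocks: List[List[int]] = []
    current: List[int] = []
    load = 0
    for i, w in enumerate(weights):
        if current and load + w > bound:
            blocks.append(current)
            current, load = [], 0
        current.append(i)
        load += w
    blocks.append(current)
    return bound, blocks


if __name__ == "__main__":
    print(assign([4, 2, 7, 3, 5, 1], 3))
```

Each probe is a single pass over the m modules, and the bisection needs a number of probes logarithmic in the total execution time, so this sketch uses O(m) space. Accounting for communication times or non-identical processors, as the paper does, would require a different feasibility test.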

