Optimal Processor Assignment for a Class of Pipelined Computations
April 1994 (vol. 5 no. 4)
pp. 439-445

The availability of large-scale multitasked parallel architectures introduces the following processor assignment problem. We are given a long sequence of data sets, each of which is to undergo processing by a collection of tasks whose intertask data dependencies form a series-parallel partial order. Each individual task is potentially parallelizable, with a known, experimentally determined execution signature. Recognizing that data sets can be pipelined through the task structure, the problem is to find a "good" assignment of processors to tasks. Two objectives interest us: minimal response time per data set, given a throughput requirement, and maximal throughput, given a response time requirement. Our approach is to decompose a series-parallel task system into its essential "serial" and "parallel" components; our problem admits the independent solution and recomposition of each such component. We provide algorithms for the series analysis, and use an algorithm due to Krishnamurti and Ma for the parallel analysis. For a p-processor system and a series-parallel precedence graph with n constituent tasks, we give an O(np^2) algorithm that finds the optimal assignment (over a broad class of assignments) for the response time optimization problem; we find the assignment optimizing the constrained throughput in O(np^2 log p) time. These techniques are applied to a task system in computer vision.
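To make the series case concrete, the following is a minimal sketch of a dynamic program for a chain of tasks. It is an illustration under stated assumptions, not the algorithm from the paper: it assumes that the response time of a chain is the sum of its stage times, that a throughput requirement translates into an upper bound on every stage time, and that each task's execution signature is available as a table of times indexed by processor count. The names `min_response_time`, `exec_time`, and `max_stage_time` are hypothetical. The triple loop visits n tasks and O(p^2) processor splits per task, which is consistent with the O(np^2) bound quoted above.

```python
from math import inf

def min_response_time(exec_time, p, max_stage_time=inf):
    """Assign at most p processors to a chain of tasks (illustrative sketch).

    exec_time[i][k - 1] is the assumed measured time of task i on k processors.
    max_stage_time is the assumed per-stage bound implied by a throughput
    requirement. Returns (total_time, allocation), or None if infeasible.
    """
    n = len(exec_time)
    # best[q]: minimal summed time of the tasks seen so far using at most q processors
    best = [0.0] * (p + 1)
    choice = []  # choice[i][q]: processors given to task i when the budget is q
    for i in range(n):
        new = [inf] * (p + 1)
        pick = [0] * (p + 1)
        for q in range(1, p + 1):
            for k in range(1, q + 1):
                t = exec_time[i][k - 1]
                if t > max_stage_time:
                    continue  # this stage would violate the throughput bound
                cand = best[q - k] + t
                if cand < new[q]:
                    new[q] = cand
                    pick[q] = k
        best = new
        choice.append(pick)
    if best[p] == inf:
        return None
    # Walk back through the recorded choices to recover the allocation
    alloc = [0] * n
    q = p
    for i in reversed(range(n)):
        alloc[i] = choice[i][q]
        q -= alloc[i]
    return best[p], alloc

# Hypothetical execution signatures for two tasks on 1..4 processors
signatures = [[8, 4, 3, 2.5], [6, 3, 2, 1.5]]
result = min_response_time(signatures, p=4, max_stage_time=4)
```

With these made-up signatures, the bound of 4 time units per stage forces at least two processors onto each task, so the only feasible split of four processors is two and two. Tightening the bound to 3 makes the instance infeasible and the function returns None.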

[1] M. Berger and S. H. Bokhari, "A partitioning strategy for nonuniform problems on multiprocessors," IEEE Trans. Comput., vol. C-36, pp. 570-580, May 1987.
[2] J. Blazewicz, M. Drabowski, and J. Weglarz, "Scheduling multiprocessor tasks to minimize schedule length," IEEE Trans. Comput., vol. C-35, pp. 389-393, May 1986.
[3] S. H. Bokhari, "A shortest tree algorithm for optimal assignments across space and time in a distributed processor system," IEEE Trans. Software Eng., vol. SE-7, no. 6, pp. 583-589, Nov. 1981.
[4] S. H. Bokhari, "Partitioning problems in parallel, pipelined, and distributed computing," IEEE Trans. Comput., vol. 37, pp. 48-57, Jan. 1988.
[5] L. Bomans and D. Roose, "Benchmarking the iPSC/2 hypercube multiprocessor," Concurrency: Practice and Experience, vol. 1, pp. 3-18, Sept. 1989.
[6] M. Y. Chan and F. Y. L. Chin, "On embedding rectangular grids in hypercubes," IEEE Trans. Comput., vol. 37, pp. 1285-1288, Oct. 1988.
[7] H-A. Choi and B. Narahari, "Algorithms for mapping and partitioning chain structured parallel computations," Proc. 1991 Int. Conf. Parallel Processing, 1991, pp. 625-628.
[8] A. N. Choudhary and J. H. Patel, Parallel Architectures and Parallel Algorithms for Integrated Vision Systems. Boston: Kluwer Academic Publishers, 1990.
[9] E. Denardo, Dynamic Programming: Models and Applications. Englewood Cliffs, NJ: Prentice-Hall, 1982.
[10] K. Dussa, B. Carlson, L. Dowdy, and K.-H. Park, "Dynamic partitioning in transputer environments," Proc. ACM SIGMETRICS Conf., 1990, pp. 203-213.
[11] J. Du and J. Y-T. Leung, "Complexity of scheduling parallel task systems," SIAM J. Disc. Math., vol. 2, no. 4, pp. 473-487, Nov. 1989.
[12] B. Fox, "Discrete optimization via marginal analysis," Management Sci., vol. 13, pp. 909-918, May 1974.
[13] M. Fox et al., Solving Problems on Concurrent Processors, vol. 1. Englewood Cliffs, NJ: Prentice-Hall, 1988.
[14] J. P. Hayes, T. N. Mudge, Q. F. Stout, and S. Colley, "Architecture of a hypercube supercomputer," Proc. 1986 Int. Conf. Parallel Processing, 1986, pp. 653-660.
[15] C.-T. Ho and S. L. Johnsson, "On the embedding of arbitrary meshes in Boolean cubes with expansion two dilation two," Proc. 1987 Int. Conf. Parallel Processing, 1987, pp. 188-191.
[16] E. Horowitz and S. Sahni, Fundamentals of Computer Algorithms, Ch. 2. New York: Computer Science Press, 1985.
[17] O. H. Ibarra and S. M. Sohn, "On mapping systolic algorithms onto the hypercube," IEEE Trans. Parallel Distrib. Syst., vol. 1, pp. 48-63, Jan. 1990.
[18] R. Kincaid, D. M. Nicol, D. Shier, and D. Richards, "A multistage linear array assignment problem," Operations Res., vol. 38, pp. 993-1005, Nov.-Dec. 1990.
[19] C.-T. King, W.-H. Chou, and L. M. Ni, "Pipelined data-parallel algorithms," IEEE Trans. Parallel Distrib. Syst., vol. 1, pp. 470-499, Oct. 1990.
[20] R. Krishnamurti and Y. E. Ma, "The processor partitioning problem in special-purpose partitionable systems," Proc. 1988 Int. Conf. Parallel Processing, 1988, vol. 1, pp. 434-443.
[21] M. K. Leung and T. S. Huang, "Point matching in a time sequence of stereo image pairs," Tech. Rep., CSL, Univ. of Ill. at Urbana-Champaign, Urbana, IL, 1987.
[22] W. N. Martin and J. K. Aggarwal, Eds., Motion Understanding, Robot and Human Vision. Boston: Kluwer, 1988.
[23] R. G. Melhem and G.-Y. Hwang, "Embedding rectangular grids into square grids with dilation two," IEEE Trans. Comput., vol. 39, pp. 1446-1455, Dec. 1990.
[24] D. M. Nicol and D. R. O'Hallaron, "Improved algorithms for mapping parallel and pipelined computations," IEEE Trans. Comput., vol. 40, pp. 295-306, Mar. 1991.
[25] C. D. Polychronopoulos, D. J. Kuck, and D. A. Padua, "Utilizing multi-dimensional loop parallelism on large scale parallel processor systems," IEEE Trans. Comput., vol. 38, pp. 1285-1296, Sept. 1989.
[26] P. Sadayappan and F. Ercal, "Nearest-neighbor mappings of finite element graphs onto processor meshes," IEEE Trans. Comput., vol. C-36, pp. 1408-1424, Dec. 1987.
[27] D. S. Scott and R. Brandenburg, "Minimal mesh embeddings in binary hypercubes," IEEE Trans. Comput., vol. 37, pp. 1284-1285, Oct. 1988.
[28] K. C. Sevcik, "Characterization of parallelism in applications and their use in scheduling," ACM SIGMETRICS, pp. 171-180, 1989.
[29] H. J. Siegel, L. J. Siegel, F. C. Kemmerer, P. T. Mueller, H. E. Smalley, and S. D. Smith, "PASM: A partitionable SIMD/MIMD system for image processing and pattern recognition," IEEE Trans. Comput., vol. C-30, no. 12, pp. 934-947, Dec. 1981.
[30] C. V. Stewart and C. R. Dyer, "Scheduling algorithms for PIPE (pipelined image-processing engine)," J. Parallel Distrib. Computing, vol. 5, pp. 131-153, 1988.
[31] H. Stone, "Multiprocessor scheduling with the aid of network flow algorithms," IEEE Trans. Software Eng., vol. SE-3, no. 1, pp. 85-93, Jan. 1977.
[32] H. S. Stone, J. Turek, and J. L. Wolf, "Optimal partitioning of cache memory," IEEE Trans. Comput., vol. 41, pp. 1054-1068, Sept. 1992.
[33] D. Towsley, "Allocating programs containing branches and loops within a multiple processor system," IEEE Trans. Software Eng., vol. SE-12, pp. 1018-1024, Oct. 1986.
[34] J. Valdes, R. E. Tarjan, and E. L. Lawler, "The recognition of series parallel digraphs," SIAM J. Comput., vol. 11, no. 2, pp. 298-313, May 1982.
[35] C. Weems et al., "The DARPA Image Understanding Benchmark for Parallel Computers," J. Parallel and Distributed Computing, Jan. 1991, pp. 1-24.

Index Terms:
pipeline processing; resource allocation; parallel architectures; pipelined computations; multitasked parallel architectures; processor assignment problem; data dependencies; series-parallel partial order; computer vision; parallel analysis; data sets; task structure; series-parallel task system; series analysis
Citation:
A.N. Choudhary, B. Narahari, D.M. Nicol, R. Simha, "Optimal Processor Assignment for a Class of Pipelined Computations," IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 4, pp. 439-445, April 1994, doi:10.1109/71.273050