G.N. Srinivasa Prasanna and B.R. Musicus, "Generalized Multiprocessor Scheduling and Applications to Matrix Computations," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 6, pp. 650-664, June 1996.
Abstract—This paper considerably extends the multiprocessor scheduling techniques of [1], [2], and applies them to the compilation of matrix arithmetic. In [1], [2] we presented several new results in the theory of homogeneous multiprocessor scheduling. A directed acyclic graph (DAG) of tasks is to be scheduled. Tasks are assumed to be parallelizable: as more processors are applied to a task, the time taken to compute it decreases, yielding some speedup. Because of communication, synchronization, and task scheduling overhead, this speedup increases less than linearly with the number of processors applied. The optimal scheduling problem is to determine the number of processors assigned to each task, together with the task sequencing, so as to minimize the finishing time.
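The flavor of this allocation problem can be shown with a small illustrative sketch. It is not taken from the paper: the concrete speedup function p**alpha, the task workloads, and the function names `allocate` and `makespan` are assumptions chosen here for concreteness. With a sublinear speedup of this form, a task of work w on p processors finishes in w / p**alpha; for independent tasks run concurrently, equalizing finishing times gives each task a processor share proportional to w**(1/alpha).

```python
# Illustrative sketch only (assumptions: speedup = p**alpha, 0 < alpha <= 1,
# tasks are independent and run concurrently). A task with total work w on
# p processors finishes in w / p**alpha.

def allocate(work, total_procs, alpha=0.5):
    """Split total_procs among tasks so that all finish at the same time:
    p_i proportional to w_i ** (1 / alpha)."""
    shares = [w ** (1.0 / alpha) for w in work]
    total = sum(shares)
    return [total_procs * s / total for s in shares]

def makespan(work, procs, alpha=0.5):
    """Finishing time of the slowest task under the given allocation."""
    return max(w / p ** alpha for w, p in zip(work, procs))

work = [4.0, 1.0, 1.0]        # assumed workloads of three independent tasks
procs = allocate(work, 6.0)   # [16/3, 1/3, 1/3] for alpha = 0.5
print(makespan(work, procs))  # all three tasks finish simultaneously, ~1.732
```

Equalizing finishing times is optimal here because shifting a processor fraction from one task to another would delay the first without reducing the overall makespan; scheduling a full DAG additionally requires sequencing decisions, which is the subject of the paper.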
Using optimal control theory, in the special case where the speedup function of each task is
The algorithm has been tested on a variety of DAGs commonly encountered in matrix arithmetic. The results show that if the
[1] G.N.S. Prasanna and B.R. Musicus, "The Optimal Control Approach to Generalized Multiprocessor Scheduling," Algorithmica, 1995.
[2] G.N.S. Prasanna and B.R. Musicus, "Generalised Multiprocessor Scheduling Using Optimal Control," Third Ann. ACM Symp. Parallel Algorithms and Architectures, pp. 216-228, July 1991.
[3] G.N.S. Prasanna, A. Agarwal, and B.R. Musicus, "Hierarchical Compilation of Macro Dataflow Graphs for Multiprocessors with Local Memory," IEEE Trans. Parallel and Distributed Systems, vol. 5, no. 7, pp. 720-736, July 1994.
[4] E.G. Coffman, Jr., ed., Computer and Job-Shop Scheduling Theory. New York: John Wiley and Sons, 1976.
[5] W.H. Press, B.P. Flannery, S.A. Teukolsky, and W.T. Vetterling, Numerical Recipes: The Art of Scientific Computing. Cambridge, U.K.: Cambridge Univ. Press, 1986.
[6] A. Agarwal et al., "The MIT Alewife Machine: A Large-Scale Distributed-Memory Multiprocessor," Workshop on Scalable Shared Memory Multiprocessors. Kluwer Academic Publishers, 1991. Also MIT/LCS Memo TM-454, 1991.
[7] V. Sarkar, "Partitioning and Scheduling Programs for Multiprocessors," Technical Report CSL-TR-87-328, PhD Thesis, Computer Systems Lab., Stanford University, April 1987.
[8] T. Yang and A. Gerasoulis, "A Fast Static Scheduling Algorithm for DAGs on an Unbounded Number of Processors," Proc. Supercomputing, pp. 633-642, Albuquerque, N.M., Nov. 1991.
[9] K.P. Belkhale and P. Banerjee, "Scheduling Algorithms for Parallelizable Tasks," Int'l Parallel Processing Symp., June 1993.
[10] S. Ramaswamy and P. Banerjee, "Processor Allocation and Scheduling of Macro Dataflow Graphs on Distributed Memory Multicomputers by the PARADIGM Compiler," Int'l Conf. Parallel Processing, pp. 134-138, Aug. 1993.
[11] S. Ramaswamy, S. Sapatnekar, and P. Banerjee, "A Convex Programming Approach for Exploiting Data and Functional Parallelism on Distributed Memory Multicomputers," Int'l Conf. Parallel Processing, Aug. 1994.
[12] J. Blazewicz, M. Drabowski, and J. Weglarz, "Scheduling Multiprocessor Tasks to Minimize Schedule Length," IEEE Trans. Computers, vol. 35, no. 5, May 1986.
[13] J. Du and J. Leung, "Complexity of Scheduling Parallel Task Systems," SIAM J. Discrete Math., vol. 2, no. 4, pp. 473-487, Nov. 1989.
[14] C.C. Han and K.J. Lin, "Scheduling Parallelizable Jobs on Multiprocessors," IEEE Conf. Real-Time Systems, pp. 59-67, 1989.
[15] C. McCreary, A.A. Khan, J.J. Thompson, and M.E. McArdle, "A Comparison of Heuristics for Scheduling DAGs on Multiprocessors," Proc. Eighth Int'l Parallel Processing Symp., pp. 446-451, 1994.
[16] J. Baxter and J.H. Patel, "The Last Algorithm: A Heuristic-Based Static Task Allocation Algorithm," Proc. Int'l Conf. Parallel Processing, vol. 2, pp. 217-222, 1989.
[17] T.C. Hu, "Parallel Sequencing and Assembly Line Problems," Operations Research, vol. 9, no. 6, pp. 841-848, 1961.
[18] D. Chaiken and A. Agarwal, "Software-Extended Coherent Shared Memory: Performance and Cost," Twenty-First Ann. Int'l Symp. Computer Architecture (ISCA 21), ACM, April 1994.