On the Granularity and Clustering of Directed Acyclic Task Graphs
June 1993 (vol. 4 no. 6)
pp. 686-701

The authors consider the impact of granularity on the scheduling of task graphs. Scheduling consists of two parts: the assignment of tasks to processors, also called clustering, and the ordering of tasks for execution within each processor. The authors distinguish two types of clusterings: linear and nonlinear. A clustering is nonlinear if two parallel tasks are mapped to the same cluster; otherwise it is linear. Linear clustering fully exploits the natural parallelism of a given directed acyclic task graph (DAG), while nonlinear clustering sequentializes independent tasks to reduce parallelism. The authors also introduce a new quantification of the granularity of a DAG and define a coarse-grain DAG as one whose granularity is greater than one. They prove that every nonlinear clustering of a coarse-grain DAG can be transformed into a linear clustering whose parallel time is less than or equal to that of the nonlinear one. This result is used to prove the optimality of some important linear clusterings used in parallel numerical computing.
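The linear/nonlinear distinction in the abstract can be checked mechanically: a clustering is linear exactly when no cluster contains two tasks that are independent (parallel) in the DAG, i.e. when every pair of tasks sharing a cluster lies on a directed path. The following is a minimal sketch of that test, not code from the paper; the diamond-shaped example DAG and all names (`succ`, `is_linear`, task labels) are illustrative assumptions.

```python
from itertools import combinations

def reachable(succ, src, dst):
    """Depth-first search: is there a directed path src -> dst?"""
    stack, seen = [src], set()
    while stack:
        v = stack.pop()
        if v == dst:
            return True
        if v not in seen:
            seen.add(v)
            stack.extend(succ.get(v, ()))
    return False

def is_linear(succ, clusters):
    """A clustering is linear iff no cluster holds two parallel
    (mutually unreachable) tasks of the DAG."""
    for cluster in clusters:
        for a, b in combinations(cluster, 2):
            if not (reachable(succ, a, b) or reachable(succ, b, a)):
                return False  # a and b are parallel yet share a cluster
    return True

# Diamond DAG: t1 -> {t2, t3} -> t4; tasks t2 and t3 are parallel.
succ = {"t1": ["t2", "t3"], "t2": ["t4"], "t3": ["t4"], "t4": []}

print(is_linear(succ, [["t1", "t2", "t4"], ["t3"]]))  # True: linear
print(is_linear(succ, [["t1", "t2", "t3", "t4"]]))    # False: t2 || t3
```

The second clustering sequentializes the independent tasks t2 and t3 in one cluster, which is exactly the nonlinear case the abstract describes.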

[1] F. D. Anger, J. Hwang, and Y. Chow, "Scheduling with sufficient loosely coupled processors," J. Parallel and Distributed Comput., vol. 9, pp. 87-92, 1990.
[2] D. Callahan and K. Kennedy, "Compiling programs for distributed-memory multiprocessors," J. Supercomput., vol. 2, pp. 151-169, 1988.
[3] Ph. Chretienne, "Task scheduling over distributed memory machines," in Proc. Int. Workshop Parallel and Distributed Algorithms. Amsterdam: North-Holland, 1989.
[4] M. Cosnard, M. Marrakchi, Y. Robert, and D. Trystram, "Parallel Gaussian elimination on an MIMD computer," Parallel Comput., vol. 6, pp. 275-296, 1988.
[5] J. J. Dongarra et al., "A set of level 3 basic linear algebra subprograms," ACM Trans. Math. Software, vol. 16, no. 1, pp. 1-17, 1990.
[6] T. H. Dunigan, "Performance of a second generation hypercube," Oak Ridge National Laboratory, TN, Tech. Rep. ORNL/TM-10881, Nov. 1988.
[7] A. Gerasoulis and I. Nelken, "Static scheduling for linear algebra DAG's," in Proc. 4th Conf. Hypercubes, Monterey, vol. 1, 1989, pp. 671-674.
[8] A. Gerasoulis and T. Yang, "A comparison of clustering heuristics for scheduling DAG's on multiprocessors," J. Parallel and Distributed Comput., special issue on scheduling and load balancing, vol. 16, no. 4, pp. 276-291, Dec. 1992.
[9] A. Gerasoulis and S. Venugopal, "Linear clustering of linear algebra task graphs for local memory systems," Report, 1990.
[10] G. H. Golub and C. F. Van Loan, Matrix Computations. Baltimore, MD: Johns Hopkins, 1989.
[11] M. T. Heath and C. H. Romine, "Parallel solution of triangular systems on distributed-memory multiprocessors," SIAM J. Sci. Statist. Comput., vol. 9, no. 3, pp. 558-600, 1988.
[12] J. A. Hoogeveen, S. L. van de Velde, and B. Veltman, "Complexity of scheduling multiprocessor tasks with prespecified processor allocations," CWI, The Netherlands, Rep. BS-R9211, June 1992.
[13] S. J. Kim and J. C. Browne, "A general approach to mapping of parallel computation upon multiprocessor architectures," in Proc. Int. Conf. Parallel Processing, vol. 3, 1988, pp. 1-8.
[14] B. Kruatrachue and T. Lewis, "Grain size determination for parallel processing," IEEE Software, pp. 23-32, Jan. 1988.
[15] S. Y. Kung, VLSI Array Processors. Englewood Cliffs, NJ: Prentice-Hall, 1988.
[16] C. McCreary and H. Gill, "Automatic determination of grain size for efficient parallel processing," Commun. ACM, vol. 32, pp. 1073-1078, Sept. 1989.
[17] J. M. Ortega, Introduction to Parallel and Vector Solution of Linear Systems. New York: Plenum, 1988.
[18] C. Papadimitriou and M. Yannakakis, "Toward an architecture-independent analysis of parallel algorithms," SIAM J. Comput., vol. 19, no. 2, pp. 322-328, Apr. 1990.
[19] C. Picouleau, "New complexity results on the UET-UCT scheduling algorithms," in Proc. Summer School on Scheduling Theory and its Applications, Chateau De Bonas, France, 1992, pp. 487-502.
[20] Y. Robert, B. Tourancheu, and G. Villard, "Data allocation strategies for the Gauss and Jordan algorithms on a ring of processors," Inform. Processing Lett., vol. 31, pp. 21-29, 1989.
[21] Y. Saad, "Gaussian elimination on hypercubes," in Parallel Algorithms and Architectures, M. Cosnard et al., Eds. Amsterdam: North-Holland, 1986.
[22] I. C. F. Ipsen, Y. Saad, and M. Schultz, "Complexity of dense linear system solution on a multiprocessor ring," Linear Algebra and Appl., vol. 77, pp. 205-239, 1986.
[23] V. Sarkar, Partitioning and Scheduling Parallel Programs for Multiprocessing. Cambridge, MA: MIT Press, 1989.
[24] H. S. Stone, High-Performance Computer Architecture. Reading, MA: Addison-Wesley, 1987.
[25] M. Wu and D. Gajski, "A programming aid for hypercube architectures," J. Supercomput., vol. 2, pp. 349-372, 1988.
[26] T. Yang and A. Gerasoulis, "A fast static scheduling algorithm for DAG's on an unbounded number of processors," in Proc. IEEE Supercomputing '91, Albuquerque, NM, Nov. 1991, pp. 633-642.
[27] T. Yang and A. Gerasoulis, "Pyrros: Static task scheduling and code generation for message-passing multiprocessors," in Proc. 6th ACM Int. Conf. Supercomputing, New York, 1992, pp. 428-443.

Index Terms:
scheduling; task graphs; clustering; directed acyclic task graph; DAG; granularity; graph theory; parallel algorithms
A. Gerasoulis, T. Yang, "On the Granularity and Clustering of Directed Acyclic Task Graphs," IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 6, pp. 686-701, June 1993, doi:10.1109/71.242154