Efficient Techniques for Clustering and Scheduling onto Embedded Multiprocessors
July 2006 (vol. 17 no. 7)
pp. 667-680

Abstract: Multiprocessor mapping and scheduling algorithms have been extensively studied over the past few decades and have been tackled from different perspectives. In the late 1980s, the two-step decomposition of scheduling, into clustering and cluster-scheduling, was introduced. Since then, several clustering and merging algorithms have been proposed and individually reported to be efficient. However, it is not clear how effective they are and how well they compare against single-step scheduling algorithms or other multistep algorithms. In this paper, we explore the effectiveness of the two-phase decomposition of scheduling and describe efficient and novel techniques that aggressively streamline interprocessor communications and can be tuned to exploit the significantly longer compilation time that is available to embedded system designers. We evaluate a number of leading clustering and merging algorithms using a set of benchmarks with diverse structures. We present an experimental setup for comparing the single-step against the two-step scheduling approach. We determine the importance of the different steps in scheduling and their effect on overall schedule performance, and show that the decomposition of the scheduling process indeed improves the overall performance. We also show that the quality of the solutions depends on the quality of the clusters generated in the clustering step. Based on the results, we also discuss why the parallel time metric in the clustering step may not provide an accurate measure for the final performance of cluster-scheduling.

Index Terms:
Interprocessor communication, multiprocessor systems, scheduling, task partitioning.
Vida Kianzad, Shuvra S. Bhattacharyya, "Efficient Techniques for Clustering and Scheduling onto Embedded Multiprocessors," IEEE Transactions on Parallel and Distributed Systems, vol. 17, no. 7, pp. 667-680, July 2006, doi:10.1109/TPDS.2006.87