Mapping and Load-Balancing Iterative Computations
June 2004 (vol. 15 no. 6)
pp. 546-558

Abstract—This paper is devoted to mapping iterative algorithms onto heterogeneous clusters. The application data is partitioned over the processors, which are arranged along a virtual ring. At each iteration, independent calculations are carried out in parallel, and some communications take place between consecutive processors in the ring. The question is to determine how to slice the application data into chunks, and to assign these chunks to the processors, so that the total execution time is minimized. One major difficulty is to embed a processor ring into a network that typically is not fully connected, so that some communication links have to be shared by several processor pairs. We establish a complexity result that assesses the difficulty of this problem, and we design a practical heuristic that provides efficient mapping, routing, link-sharing, and data distribution schemes.
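
As a rough illustration of the data-slicing question only, the sketch below distributes equal-sized data items over processors of different speeds so that the slowest processor finishes as early as possible. It is not the heuristic designed in the paper: it assumes a fully simplified cost model with no communication cost, no ring embedding, and no link sharing, which are exactly the aspects the abstract identifies as the major difficulty. The names `distribute`, `iteration_time`, and `cycle_times` (time units a processor needs per data item) are hypothetical and chosen for this example.

```python
import heapq


def distribute(total_items, cycle_times):
    """Assign items one at a time to the processor that would finish earliest.

    cycle_times[i] is the assumed time processor i needs per data item.
    Communication costs are deliberately ignored in this sketch.
    """
    counts = [0] * len(cycle_times)
    # Each heap entry is (finish time if given one more item, processor index).
    heap = [(w, i) for i, w in enumerate(cycle_times)]
    heapq.heapify(heap)
    for _ in range(total_items):
        _, i = heapq.heappop(heap)
        counts[i] += 1
        # Finish time if processor i were to receive yet another item.
        heapq.heappush(heap, ((counts[i] + 1) * cycle_times[i], i))
    return counts


def iteration_time(counts, cycle_times):
    """Time of one iteration: every processor must finish its own chunk."""
    return max(n * w for n, w in zip(counts, cycle_times))


if __name__ == "__main__":
    cycle_times = [1, 2, 4]  # hypothetical speeds: processor 0 is the fastest
    counts = distribute(7, cycle_times)
    print(counts, iteration_time(counts, cycle_times))
```

With the hypothetical cycle times above, the faster processors receive proportionally larger chunks. Any realistic mapping would also have to charge for the messages exchanged between consecutive processors in the ring and for links shared by several processor pairs.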

Index Terms:
Scheduling, load-balancing, iterative computations, heterogeneous clusters.
Arnaud Legrand, Hélène Renard, Yves Robert, Frédéric Vivien, "Mapping and Load-Balancing Iterative Computations," IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 6, pp. 546-558, June 2004, doi:10.1109/TPDS.2004.10