Improving Load Balance with Flexibly Assignable Tasks
October 2005 (vol. 16 no. 10)
pp. 956-965
Ali Pinar, IEEE Computer Society

Abstract: In many applications of parallel computing, the distribution of the data unambiguously implies the distribution of work among processors. However, there are exceptions where some tasks can be assigned to one of several processors without altering the total volume of communication. In this paper, we study the problem of exploiting this flexibility in the assignment of tasks to improve load balance. We first model the problem in terms of network flow and use combinatorial techniques for its solution. Our parametric search algorithms use maximum flow algorithms to probe a candidate optimal solution value. We describe two algorithms that solve the assignment problem with $\log W_T$ and $|P|$ probe calls, where $W_T$ and $|P|$ denote the total workload and the number of processors, respectively. We also define augmenting paths and cuts for this problem, and show that any algorithm based on augmenting paths can be used to find an optimal solution for the task assignment problem. We then consider a continuous version of the problem and formulate it as a linearly constrained optimization problem, i.e., $\min \|Ax\|_\infty$ s.t. $Bx = d$. To avoid solving an intractable $\infty$-norm optimization problem, we show that, in this case, minimizing the 2-norm is sufficient to minimize the $\infty$-norm, which reduces the problem to the well-studied linearly constrained least squares problem. The continuous version of the problem has the advantage of being easily amenable to parallelization. Our experiments with molecular dynamics and overlapped domain decomposition applications demonstrated the effectiveness of our methods, with significant improvements in load balance. We also discuss how our techniques can be extended to heterogeneous parallel computers.
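The parametric search idea in the abstract, probing a candidate load limit with a maximum flow computation, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes unit-weight flexible tasks, each with a nonempty set of candidate processors, and it uses a plain Edmonds-Karp routine as the augmenting-path solver. The names `max_flow`, `probe`, and `balance` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque

def max_flow(cap, s, t):
    """Edmonds-Karp on a residual-capacity dict-of-dicts (mutated in place)."""
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:      # BFS for a shortest augmenting path
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:                   # no augmenting path: flow is maximum
            return flow
        bottleneck, v = float("inf"), t
        while parent[v] is not None:          # find the bottleneck capacity
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:          # push flow along the path
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck

def probe(fixed, tasks, limit):
    """Return an assignment with every load <= limit, or None if none exists.

    fixed[p] is the load already bound to processor p (assumed integer);
    tasks[i] lists the processors that unit-weight task i may run on.
    """
    if any(load > limit for load in fixed):
        return None
    cap = defaultdict(dict)

    def add_edge(u, v, c):
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v].setdefault(u, 0)               # reverse residual edge

    for i, candidates in enumerate(tasks):
        add_edge("src", ("task", i), 1)
        for p in set(candidates):
            add_edge(("task", i), ("proc", p), 1)
    for p, load in enumerate(fixed):
        add_edge(("proc", p), "sink", limit - load)
    if max_flow(cap, "src", "sink") != len(tasks):
        return None
    # A saturated task->processor edge (residual 0) marks the chosen processor.
    return [next(p for p in set(candidates) if cap[("task", i)][("proc", p)] == 0)
            for i, candidates in enumerate(tasks)]

def balance(fixed, tasks):
    """Binary search for the smallest feasible load limit, one probe per step."""
    lo, hi = max(fixed), max(fixed) + len(tasks)   # hi is always feasible
    while lo < hi:
        mid = (lo + hi) // 2
        if probe(fixed, tasks, mid) is not None:
            hi = mid
        else:
            lo = mid + 1
    return lo, probe(fixed, tasks, lo)

if __name__ == "__main__":
    fixed_loads = [3, 1, 1]
    flexible = [(0, 1), (0, 1), (1, 2), (1, 2), (0, 2)]
    print(balance(fixed_loads, flexible))
```

In this sketch the number of probes is logarithmic in the width of the search interval, which here is the number of flexible tasks. The bounds in the abstract refer to the authors' own algorithms and are not reproduced by this simplified version.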

Index Terms:
Parallel computing, load balancing, flexibly assignable tasks, maximum flow, constrained least squares.
Ali Pinar, Bruce Hendrickson, "Improving Load Balance with Flexibly Assignable Tasks," IEEE Transactions on Parallel and Distributed Systems, vol. 16, no. 10, pp. 956-965, Oct. 2005, doi:10.1109/TPDS.2005.123