Load balancing on parallel computers aims at equilibrating some initial load which is initially different from one processor to another. Nearest neighbour load balancing algorithms can be divided basically into two classes: diffusion and dimension exchange. Whereas the first is appropriate for the so-called all-port-model where a processor can send tokens to all its neighbours at a time, the latter relies on the one-port-model.
In the last few years finite diffusion algorithms for general graphs as well as for product graphs like grids and tori have been developed. Recently finite dimension exchange algorithms have been proposed by the author. In the present paper we will introduce one new diffusion and two new dimension-exchange schemes for product graphs. We will show that the latter two can achieve nearly minimal communication operations and therefore short run-times. Additionally some modifications of the algorithms will be presented that reduce the flows so that only very few load items are moved via longer paths than necessary.