Parallel and Distributed Processing Symposium, International (2001)
San Francisco, California, USA
Apr. 23, 2001 to Apr. 27, 2001
We present a general deterministic scheme to implement a shared memory abstraction on any distributed-memory machine which exhibits a clustered structure. More specifically, we develop a memory distribution strategy and an access protocol for the Decomposable BSP (D-BSP), a generic machine model whose bandwidth/latency parameters can be instantiated to closely reflect the characteristics of machines that admit a hierarchical decomposition into independent clusters. Our scheme achieves provably optimal slowdown for those machines where delays due to latency dominate over those due to bandwidth limitations. For machines where this is not the case, the slowdown is a mere logarithmic factor away from the natural bandwidth-based lower bound. An important feature of the scheme is that it can be made fully constructive for small memory sizes, while for larger sizes it relies solely on nonconstructive graphs of weak expansion.
C. Fantozzi, A. Pietracaprina and G. Pucci, "Implementing Shared Memory on Clustered Machines (Extended Abstract)," Parallel and Distributed Processing Symposium, International (IPDPS), San Francisco, California, USA, 2001, pp. 10062b.