Enabling Efficient Inter-Node Message Passing and Remote Memory Access Via a uGNI Based Light-Weight Network Substrate for Cray Interconnects
2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) (2018)
Washington, DC, USA
May 1, 2018 to May 4, 2018
Today's cutting-edge network hardware offers extremely low-latency, high-bandwidth transactions to higher-level communication substrates. The network fabrics of the Cray XC and XE families, Aries and Gemini respectively, support high-performance remote memory access (RMA) operations and a plethora of transaction modes for optimizing communication through lower-level interfaces such as uGNI and DMAPP. However, enabling efficient one-sided communication for higher-level substrates is difficult because of barriers presented by the programming model itself, as well as miscellaneous synchronization bottlenecks at the runtime layers. We present an efficient programming model based on a distributed memory allocator for RMA, together with a communication substrate based on readers and writers for inter-node message passing and RMA operations. We aim to maximize performance with a scalable RMA event notification scheme and synchronization protocols that fully leverage the Aries/Gemini fabric. Micro-benchmark results demonstrate that our library outperforms Cray MPI-3.0-based one-sided RMA operations by 1.5X and up to 6X in certain cases, and delivers comparable or better performance in the others.
message passing, parallel programming, protocols
U. Wickramasinghe and A. Lumsdaine, "Enabling Efficient Inter-Node Message Passing and Remote Memory Access Via a uGNI Based Light-Weight Network Substrate for Cray Interconnects," 2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), Washington, DC, USA, 2018, pp. 578-588.