Parallel and Distributed Processing Symposium, International (2008)
Miami, FL, USA
Apr. 14, 2008 to Apr. 18, 2008
Nikitas J. Dimopoulos, Department of Electrical and Computer Engineering, University of Victoria, B.C., Canada
Farshad Khunjush, Department of Electrical and Computer Engineering, University of Victoria, B.C., Canada
The main contributors to message delivery latency in message passing environments are the copying operations needed to transfer and bind a received message to the consuming process/thread. A significant portion of the software communication overhead is attributed to message copying. Recently, a set of factors has been leading high-performance processor architectures toward designs that feature multiple processing cores on a single chip (chip multiprocessors, or CMPs). The Cell Broadband Engine (BE) shows potential to provide high performance to parallel applications (e.g., MPI applications). However, the Cell’s non-homogeneous architecture, along with the small local storage in its Synergistic Processing Elements (SPEs), imposes restrictions and challenges on parallel applications. In this work, we first characterize various data delivery mechanisms in the Cell BE processor; then, we propose techniques to facilitate the delivery of a message in MPI environments implemented on the Cell BE processor. We envision a cluster system comprising several Cell processors, each supporting several computation threads.
Nikitas J. Dimopoulos, Farshad Khunjush, "Extended characterization of DMA transfers on the Cell BE processor", Parallel and Distributed Processing Symposium, International, vol. 00, no. , pp. 1-8, 2008, doi:10.1109/IPDPS.2008.4536190