2008 11th IEEE International Conference on Computational Science and Engineering
Application Performance Tuning for Clusters with ccNUMA Nodes
July 16-July 18
ISBN: 978-0-7695-3193-9
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/CSE.2008.46
With the growing trend toward placing more cores on a single chip, more clusters are adopting multicore multiprocessor nodes for high-performance computing (HPC). Cache-coherent non-uniform memory access (ccNUMA) architectures are becoming an increasingly popular choice for such systems. This paper presents an application performance analysis on a system with 2312 Opteron cores built from Sun Fire servers. Performance bottlenecks are identified and potential solutions are proposed. With the proposed performance tuning, application performance improvements of up to 30% were observed. In addition, the experimental analysis can help HPC application developers better understand clusters with ccNUMA nodes and can serve as a guideline for using such architectures in scientific computing.
Index Terms:
ccNUMA, application performance, CPU affinity, high-performance computing
Citation:
Abdullah Kayi, Edward Kornkven, Tarek El-Ghazawi, Greg Newby, "Application Performance Tuning for Clusters with ccNUMA Nodes," cse, pp. 245-252, 2008 11th IEEE International Conference on Computational Science and Engineering, 2008