2009 IEEE International Symposium on Parallel & Distributed Processing
Parallel data-locality aware stencil computations on modern micro-architectures
Rome, Italy
May 23-May 29
ISBN: 978-1-4244-3751-1
Authors: Matthias Christen, Olaf Schenk, Esra Neufeld, Peter Messmer, Helmar Burkhart
Pages: 1-10
DOI: http://doi.ieeecomputersociety.org/10.1109/IPDPS.2009.5161031
Publisher: IEEE Computer Society, Los Alamitos, CA, USA
Novel micro-architectures, including the Cell Broadband Engine Architecture and graphics processing units, are attractive platforms for compute-intensive simulations. This paper focuses on stencil computations arising in a biomedical simulation; it presents performance benchmarks on both the Cell BE and GPUs and contrasts them with a benchmark on a traditional CPU system. Because stencil computations have low arithmetic intensity, they typically reach only a fraction of the peak performance of the compute hardware. An algorithm is presented that reduces the bandwidth requirements, and thereby improves performance, by exploiting temporal locality of the data. We report on performance improvements over CPU implementations.
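The general idea of exploiting temporal locality in a stencil sweep can be illustrated with a minimal sketch. This is not the algorithm from the paper, which the abstract does not describe; it is a generic overlapped-tiling (ghost-zone) scheme on a 1D 3-point averaging stencil, chosen only for illustration. The names `step`, `naive`, and `time_blocked` and the `tile` parameter are assumptions, as is the use of fixed boundary values.

```python
def step(u):
    """One Jacobi sweep of a 3-point averaging stencil.

    The first and last entries are treated as fixed boundary values
    (an assumption made for this sketch).
    """
    out = list(u)
    for i in range(1, len(u) - 1):
        out[i] = (u[i - 1] + u[i] + u[i + 1]) / 3.0
    return out


def naive(u, steps):
    """Reference version: every time step streams the whole array once."""
    for _ in range(steps):
        u = step(u)
    return u


def time_blocked(u, steps, tile=8):
    """Overlapped tiling: each tile is loaded once, together with a halo of
    width `steps` on each side, and advanced `steps` time steps locally.

    Halo points are recomputed redundantly by neighbouring tiles. That extra
    arithmetic is the price paid for touching main memory once per tile
    instead of once per time step.
    """
    n = len(u)
    result = list(u)
    for start in range(1, n - 1, tile):
        stop = min(start + tile, n - 1)
        lo = max(start - steps, 0)   # left halo, clamped at the boundary
        hi = min(stop + steps, n)    # right halo, clamped at the boundary
        local = u[lo:hi]             # small working set that could stay in fast memory
        for _ in range(steps):
            local = step(local)
        # Only the tile interior is valid; points near a cut edge are discarded.
        result[start:stop] = local[start - lo:stop - lo]
    return result


if __name__ == "__main__":
    grid = [float((i * 7) % 11) for i in range(50)]
    print(time_blocked(grid, steps=4) == naive(grid, steps=4))
```

In this sketch both versions apply the same arithmetic to the same inputs at every retained point, so their results agree. On real hardware the benefit would come from keeping each tile in a cache, local store, or shared memory while it is advanced through several time steps; the Python code only demonstrates the data dependencies, not the performance.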
Citation:
Matthias Christen, Olaf Schenk, Esra Neufeld, Peter Messmer, Helmar Burkhart, "Parallel data-locality aware stencil computations on modern micro-architectures," 2009 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), pp. 1-10, 2009.
