Jeremy S. Meredith, Philip C. Roth, Kyle L. Spafford, Jeffrey S. Vetter, "Performance Implications of Nonuniform Device Topologies in Scalable Heterogeneous Architectures," IEEE Micro, vol. 31, no. 5, pp. 66-75, September/October 2011.
DOI: http://doi.ieeecomputersociety.org/10.1109/MM.2011.79
ISSN: 0272-1732
Publisher: IEEE Computer Society, Los Alamitos, CA, USA
Keywords: GPU, nonuniformity, heterogeneous GPUs, data-transfer performance
This article considers trends in heterogeneous system design, particularly for GPUs. Using the Keeneland Initial Delivery System, the authors examine the performance implications of increased parallelism and specialized hardware on parallel scientific applications. They examine how nonuniform data-transfer performance across the node-level topology can impact performance. Finally, they help users of GPU-based systems avoid performance problems related to this nonuniformity.

