Issue No.02 - March/April (2010 vol.30)
pp: 56-69
John Nickolls, NVIDIA
ABSTRACT
GPU computing is at a tipping point, becoming more widely used in demanding consumer applications and high-performance computing. This article describes the rapid evolution of GPU architectures, from graphics processors to massively parallel many-core multiprocessors; recent developments in GPU computing architectures; and how the enthusiastic adoption of CPU+GPU coprocessing is accelerating parallel applications.
INDEX TERMS
GPU computing, CUDA, scalable parallel computing, heterogeneous CPU+GPU coprocessing, Tesla GPU architecture, Fermi GPU architecture, NVIDIA.
CITATION
John Nickolls, William J. Dally, "The GPU Computing Era," IEEE Micro, vol. 30, no. 2, pp. 56-69, March/April 2010, doi:10.1109/MM.2010.41