Is it Time to Rename the GPU? Used for Far More than Graphics Now
I was reminiscing last week about the GPU and having a friendly debate with my pals at Nvidia about the origin of the acronym. They of course claim they invented it, the device and the acronym, and within certain qualifications they did.
The term was first used by Sony in 1994 with the launch of the PS1. That system had a 32-bit Sony GPU (designed by Toshiba). The acronym was also used, before and after that, to refer to a geometry processing unit. TriTech introduced its Geometry Processor Unit in 1996, and Microsoft licensed it from them in 1998. It was part of a multi-chip solution and used the OpenGL API.
3DLABS introduced the Glint multi-chip set in 1995 with a geometry processor unit (later integrated into one chip). It was targeted at the professional graphics workstation market, which at the time was the most demanding in terms of performance. Nvidia targeted its device at the gaming community, which was smaller but growing rapidly. Five or six years later the gaming market took off, taking Nvidia with it, while the workstation market flattened out and didn’t provide enough sales for companies like 3DLABS to continue investing in new semiconductor designs. Soon Nvidia was able to adapt its device to the professional graphics market too, which increased the competitive pressure on dedicated graphics processor suppliers.
From 2000, the term GPU as applied to geometry processing unit has been frequently used and appears in dozens of patents.
During that same period, researchers at universities, always on the hunt for less expensive computing power and more of it, began experimenting with the processors in gaming consoles, such as the Cell in the PS3 and the GPUs from ATI and Nvidia that were used in them.
Ironically, the way they chose to program the GPU for computing applications was through OpenGL because it exposed more of the GPU’s capabilities.
Today we find the GPU being used for artificial intelligence inferencing in mobile phones and automobiles, in AI training at various companies and government agencies, crypto-currency mining, scientific, medical, and engineering application acceleration, and robotics, to name a few of the most common application workloads. The GPU is reducing the time it takes researchers and engineers to analyze, design, and diagnose problems and challenges that in the past would have taken days to weeks, and in some cases like protein folding, maybe months. Not only are answers to complex problems being obtained sooner, they are also more accurate. One of the compromises made in data reduction is to give up accuracy in order to get an answer within one lifetime.
But is it still a graphics processing engine? Clearly not. A case in point is the Nvidia Volta with its tensor engine and the Vega by AMD. Intel too will be entering the GPU market and will bring its vast AI capabilities to one of its offerings.
It is an SoC, a parallel processor with associated special-function engines such as a video codec, rasterizer, neural-net accelerator, and DSPs for audio and image processing.
Today’s chips are massive devices. Nvidia’s Volta, for example, is the largest chip ever made and measures 815 mm². Crammed into that 12nm die are 21.1 billion (with a B) transistors, lots of them applied to the 5,376 32-bit floating-point cores configured in a SIMD architecture, making it the biggest integrated parallel processor ever built. It’s likely to hold that title for quite a while because as the feature size goes down, so does the yield, making these giant chips harder to build and more expensive.
They are also prodigious consumers of data and demand fast, tightly coupled memory with the highest bandwidth possible to feed all those 32-bit ALUs. To try to satisfy that demand, AMD and Nvidia have attached 32 GB of high-bandwidth 3D memory (HBM) stacks, with 900 GB/s of peak memory bandwidth, to their processors.
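To put those memory numbers in perspective, here is a rough back-of-envelope sketch. It assumes the 900 GB/s and 32 GB figures apply to the same chip as the 5,376-core count quoted for Volta, and that bandwidth is shared evenly across cores; both are simplifications made for illustration, and the function names are made up for this sketch.

```python
# Figures quoted in the article; nothing here is measured.
PEAK_BANDWIDTH_GB_S = 900.0  # peak HBM bandwidth
HBM_CAPACITY_GB = 32.0       # HBM capacity
FP32_CORES = 5376            # Volta FP32 core count (assumed to be the same chip)


def bandwidth_per_core_mb_s(bandwidth_gb_s: float, cores: int) -> float:
    """Peak bandwidth share per core in MB/s, assuming an even split."""
    return bandwidth_gb_s * 1000.0 / cores


def full_sweep_ms(capacity_gb: float, bandwidth_gb_s: float) -> float:
    """Milliseconds needed to read the whole memory once at peak bandwidth."""
    return capacity_gb / bandwidth_gb_s * 1000.0


per_core = bandwidth_per_core_mb_s(PEAK_BANDWIDTH_GB_S, FP32_CORES)
sweep = full_sweep_ms(HBM_CAPACITY_GB, PEAK_BANDWIDTH_GB_S)
print(f"{per_core:.1f} MB/s per core, {sweep:.1f} ms to sweep all of memory")
```

Even at peak, each core's share works out to well under a gigabyte per second, which is one way to see why these chips are so hungry for bandwidth.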
And it doesn’t stop there. If one of these monsters is good, shouldn’t two, or four, or 16 be even better? And the answer is yes, of course. The GPU is inherently capable of scaling, but to do it you need a super-high-speed communications network, commonly referred to as a fabric today. Intel has one, and so does Nvidia. AMD has one in their Epyc CPU for linking all those x86 Zen processors. Nvidia calls their chip-to-chip fabric NVLink, and it moves data at up to 300 GB/s from one Volta GPU to another. AMD’s Infinity Fabric’s System Control Fabric (SCF) hits 41.4 GB/s within the chip.
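For a feel of what those fabric speeds mean, the sketch below estimates how long a payload would take to cross each link at its quoted peak rate. The 1 GB payload is a made-up example, the function name is hypothetical, and the estimate ignores latency and protocol overhead. Note too that the two figures describe different kinds of links (chip-to-chip versus within a chip), so this is a rough scale comparison, not a benchmark.

```python
# Peak rates quoted in the article, in GB/s.
NVLINK_GB_S = 300.0        # Volta-to-Volta over NVLink
INFINITY_SCF_GB_S = 41.4   # AMD Infinity Fabric SCF, within the chip


def transfer_ms(payload_gb: float, link_gb_s: float) -> float:
    """Idealized transfer time in milliseconds at the link's peak rate."""
    if link_gb_s <= 0:
        raise ValueError("link rate must be positive")
    return payload_gb / link_gb_s * 1000.0


payload_gb = 1.0  # hypothetical payload size, chosen only for illustration
for name, rate in (("NVLink", NVLINK_GB_S), ("Infinity SCF", INFINITY_SCF_GB_S)):
    print(f"{name}: {transfer_ms(payload_gb, rate):.2f} ms per {payload_gb:g} GB")
```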
All these techniques are modern-day versions of the designs of parallel processors developed in the late 1980s and built in big racks. Those machines were laughably slower than the SoCs in our smartphones today, but the communications schemes and the allocation of localized high-speed memory are the same, just a zillion times faster and larger in terms of ALUs. We owe all that to Moore’s law and the amazing machines in those amazing fabs that make building 7nm silicon systems possible.
And as interesting and mind-bending as all that is, it still leaves us with the need for a better name for these massive parallel-processor SoCs. No doubt the clever marketing people at one supplier or another will coin a term, so I’m not going to offer one, but I can predict the word “accelerate” or “accelerator” will probably be in it, as will the term massive, or large (remember VLSI?). And there can be a lot of fun naming these monster chips.
But what about the folks who still want and need a good to great graphics accelerator? As exciting as AI and application acceleration are, the volume for these massive processors in those applications is a fraction of what is sold for gaming, photo and video editing, and professional graphics. For that very large population we will still need and appreciate the tried and tested GPU.
So just as a cell splits in two in the process of life, maybe it’s time for the GPU to split in two and spawn its new, bigger, more powerful sibling—the parallel-processing application accelerator unit—PPAAU. Oh damn! I named it.
Dr. Jon Peddie is one of the pioneers of the graphics industry and formed Jon Peddie Research (JPR) to provide customer-intimate consulting and market forecasting services, where he explores the developments in computer graphics technology to advance economic inclusion and improve resource efficiency.
Recently named one of the most influential analysts, Peddie regularly advises investors in the technology sector. He is an advisor to the U.N., several companies in the computer graphics industry, an advisor to the Siggraph Executive Committee, and in 2018 he was accepted as an ACM Distinguished Speaker. Peddie is a senior and lifetime member of IEEE, and a former chair of the IEEE Super Computer Committee, and the former president of The Siggraph Pioneers. In 2015 he was given the Life Time Achievement award from the CAAD society.
Peddie lectures at numerous conferences and universities world-wide on topics pertaining to graphics technology and the emerging trends in digital media technology. He has appeared on CNN, TechTV, and Future Talk TV, and is frequently quoted in trade and business publications.
Dr. Peddie has published hundreds of papers and has authored or contributed to no fewer than thirteen books in his career, the most recent being Augmented Reality: Where We Will All Live. He is a contributor to TechWatch, for which he writes a series of weekly articles on AR, VR, AI, GPUs, and computer gaming, and a regular contributor to IEEE, Computer Graphics World, and several other leading publications.