In The News
July/August 2009 (Vol. 24, No. 4) pp. 5-9
1541-1672/09/$31.00 © 2009 IEEE

Published by the IEEE Computer Society
GPU-Enabled AI
Mark Ingebretsen

Marching froglike creatures might not be the most obvious way to herald a new era in computing. But take a closer look, and you'll see that the affable froblins (short for frogs and goblins; see Figure 1; www.youtube.com/watch?v=FUtzOqgsLyE) cavorting on your computer screen have been rendered in HD, and each possesses enough AI to function independently. In fact, the froblins' rendering and AI take place in real time on one of Advanced Micro Devices' (AMD's) ATI Radeon HD 4870 graphics cards. AMD has been investigating incorporating many of these AI-capable graphics cards into a cloud computing environment. As a result, froblins, along with any other creatures game developers might dream up, could become active characters in a massively multiplayer game that thousands of users could access at once.

Figure 1. Froblins. AMD developed froblins (short for frogs and goblins) to demonstrate the potential of GPU-enabled AI.
To their developers at AMD, froblins represent new potential not just for online gaming but for a plethora of other AI applications made possible by GPU (graphics processing unit)-based cloud computing. "We think AI in the cloud makes a lot of sense," says David Nalasco, technical marketing manager at AMD. As he explains it, the cloud provides a readily scalable environment for the enormous processing power of parallelized graphics chips.
GPU-enabled AI is a subset of so-called general-purpose GPU computing (GPGPU). But it promises to be one of the fastest-growing subsets. The rise of cloud computing, recent high-powered graphics-chip releases by AMD's competitor Nvidia, and the growing acceptance of the OpenCL programming platform have all converged to allow GPU-enabled AI to take off in the months ahead.
AMD continues to work with partners such as OTOY, a developer of high-speed rendering technologies, on cloud computing initiatives such as its so-called Fusion Render Cloud (FRC), as Intelligent Systems' sister publication, IEEE Spectrum, noted last March. OTOY plans to rent space on the FRC to online game publishers, who will be able to stream their latest high-res AI-endowed characters at speeds of up to 50 frames per second, depending on the bandwidth of each user's connection. Some critics say latency issues, especially crucial in the fast world of gaming, will undermine the plan. But Nalasco says that as with other cloud applications such as streaming video, those issues can be overcome by distributing the processing close to users.
A UI You Can Cozy Up To
If Nalasco is correct, then online gaming, as Spectrum noted, could well prove to be GPU computing's killer app. But both AMD and Nvidia also foresee a day not too far off when GPU-enabled AI, located either in the cloud or on stand-alone PCs, will form an integral part of the desktop as well. Officials at both companies believe the GPU's massive processing power will yield useful AI applications such as the ability to interpret a user's hand gestures and facial expressions. Still other AI applications will perform powerful photo searches on the Web or verbally translate live conversations. Put simply, GPU processing could finally bring about the kind of intelligent user interface techies first saw on Star Trek and have been dreaming about ever since.
"We fantasize all the time about what a user interface would be like if it were essentially aware," says Tony Tamasi, senior vice president of content and technology at Nvidia. Last fall, the Santa Clara graphics-processor maker brought the reality of self-aware user interfaces a tantalizing step closer when it unveiled its desktop Tesla GPU, which contains 240 cores and delivers 1 teraflop of performance. As many as four GPUs can be incorporated into a single machine, delivering 4 teraflops of performance, more than 250 times faster than typical workstations. With a price tag under $10,000, a Tesla "personal supercomputer," as they are categorized, compares favorably to supercomputers costing $1 million or more. Tesla has been initially marketed to universities and research firms, which would use it to perform highly complex proteomic and other computing-intensive research. It is also seeing rapid adoption in industries such as oil and gas and computational finance where the ability to compute large data sets in short spaces of time delivers considerable competitive advantages. But there may come a day when teraflop performance becomes standard on consumer desktops and laptops as well.
If that happens, Tamasi and others foresee an explosion of AI applications that enhance our work lives. "Many aspects of what people term AI are incredibly parallelizable and are very well suited to our style of processing," he says. "Decision tree is the only portion of AI that does not have immediately obvious parallelizable opportunities." Tamasi says companies are already hard at work creating what he calls sensory-oriented applications, such as pattern matching and image recognition.
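To see why such sensory-oriented tasks suit the GPU, consider template matching, a simple form of pattern recognition: every candidate position in an image can be scored independently, so each position can be handed to its own thread. The CUDA kernel below is a minimal sketch of that idea rather than code from Nvidia or any shipping application; the image, template, and score buffers are illustrative placeholders.

#include <cuda_runtime.h>

// Sum-of-absolute-differences template matching: one thread scores one
// candidate position of the template against a grayscale image.
// Lower scores indicate better matches.
__global__ void sadMatch(const unsigned char *image, int imgW, int imgH,
                         const unsigned char *tmpl, int tW, int tH,
                         float *scores)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;   // candidate column
    int y = blockIdx.y * blockDim.y + threadIdx.y;   // candidate row
    if (x > imgW - tW || y > imgH - tH) return;      // template must fit

    float sad = 0.0f;
    for (int ty = 0; ty < tH; ++ty)
        for (int tx = 0; tx < tW; ++tx)
            sad += fabsf((float)image[(y + ty) * imgW + (x + tx)] -
                         (float)tmpl[ty * tW + tx]);

    scores[y * (imgW - tW + 1) + x] = sad;           // one score per position
}

// Typical launch, where outW = imgW - tW + 1 and outH = imgH - tH + 1:
// sadMatch<<<dim3((outW + 15) / 16, (outH + 15) / 16), dim3(16, 16)>>>(
//     dImage, imgW, imgH, dTmpl, tW, tH, dScores);

Each thread does a small, independent piece of brute-force work, and the position with the lowest score is the best match; this is exactly the style of computation that hundreds of GPU cores absorb easily.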
The Software Gap
How quickly such applications become mainstream is anyone's guess. As has occurred throughout the history of computing, a lag exists between hardware advances and the software able to exploit them. Add to that lag an indeterminate amount of time before consumers embrace the new software.
GPU computing provides an extreme example. It was little more than a decade ago that researchers and hard-core geeks first realized that the low-cost graphics chips mass-produced for video game consoles held a veritable gold mine of processing power for scientific applications. The same chips able to throw out billions of polygons and shade game backgrounds with ever-increasing speed could also model how drugs interact with cells or predict the path of storms. Early experimenters with GPU computing literally strung together a half dozen or more game consoles and wrote programs that broke up the complex problems they hoped to solve into millions of finite tasks.
Soon, researchers began programming in higher-level AI functionality. "Graphical processing units are very simple. But collectively they can generate very complex behaviors," explains Xiaohui Cui, an AI researcher at the US Department of Energy's Oak Ridge National Laboratory.
Cui uses GPU computing for a number of projects, including data mining for scientific documents. And he's looked into using GPU computing to study so-called emergent behavior, which is "how group behavior can be generated by the actions of individuals." The field, which has practical implications in everything from social networking to antiterrorism and pandemic-disease containment, requires high-powered agent-based simulation, something that's only practical via parallelized computing. "To look at how millions of individual entities will collectively generate behavior, you need to do a simulation," he says.
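Cui's simulation code isn't reproduced here, but the general mapping he describes is straightforward: one GPU thread advances one agent per time step. The toy CUDA kernel below applies a single cohesion rule, drifting each agent toward the average position of its neighbors; the names, the all-pairs neighborhood scan, and the parameters are purely illustrative.

#include <cuda_runtime.h>

// Toy agent-based simulation step: each thread updates one agent, nudging it
// toward the average position of the agents within a fixed radius.
__global__ void stepAgents(const float2 *posIn, float2 *posOut,
                           int nAgents, float radius, float rate)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nAgents) return;

    float2 me = posIn[i];
    float sumX = 0.0f, sumY = 0.0f;
    int count = 0;

    // Naive all-pairs neighborhood scan; a real simulation would use a
    // spatial grid or shared-memory tiling to cut the cost.
    for (int j = 0; j < nAgents; ++j) {
        float dx = posIn[j].x - me.x, dy = posIn[j].y - me.y;
        if (dx * dx + dy * dy < radius * radius) {
            sumX += posIn[j].x;
            sumY += posIn[j].y;
            ++count;
        }
    }

    // Cohesion: drift toward the local centroid (the agent always counts
    // itself, so count is at least 1).
    posOut[i].x = me.x + rate * (sumX / count - me.x);
    posOut[i].y = me.y + rate * (sumY / count - me.y);
}

// Typical launch, once per simulated time step:
// stepAgents<<<(nAgents + 255) / 256, 256>>>(dPosIn, dPosOut, nAgents, 2.0f, 0.05f);

Swapping the input and output buffers between steps keeps every agent's update based on the same snapshot of the population, so millions of updates remain independent and therefore parallel.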
In its earliest years, GPU-enabled AI computing was hindered by the relatively small number of programmers who could adapt their code for parallel processing and by the lack of APIs general enough to be adopted by programmers outside the graphics domain. Things began changing in early 2007, when Nvidia unveiled the software development kit for its CUDA (Compute Unified Device Architecture) language. A little over a year later, AMD countered with Brook+. Both languages are loosely based on C. While CUDA adds a few extensions to C, Brook+ is an implementation of the Brook GPU specification on AMD's chipsets. The original Brook is a streaming programming language developed at Stanford University.
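Those extensions are modest: a qualifier that marks a function as GPU code, built-in thread and block indices, and a triple-angle-bracket launch syntax. The minimal, self-contained CUDA program below, the canonical vector addition rather than anything tied to the products discussed here, shows all three.

#include <cuda_runtime.h>
#include <stdio.h>

// __global__ marks a function that runs on the GPU but is called from the CPU.
__global__ void vecAdd(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // built-in thread indices
    if (i < n) c[i] = a[i] + b[i];
}

int main(void)
{
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    float *hA = (float *)malloc(bytes), *hB = (float *)malloc(bytes),
          *hC = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { hA[i] = 1.0f; hB[i] = 2.0f; }

    float *dA, *dB, *dC;
    cudaMalloc(&dA, bytes); cudaMalloc(&dB, bytes); cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);

    // The <<<blocks, threads>>> syntax is the other visible extension to C.
    vecAdd<<<(n + 255) / 256, 256>>>(dA, dB, dC, n);
    cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);

    printf("c[0] = %f\n", hC[0]);   // expect 3.0
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    free(hA); free(hB); free(hC);
    return 0;
}

Everything else, the loops, pointers, and memory management, is ordinary C, which is what lowered the barrier for programmers coming from outside the graphics world.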
While CUDA has a wider following and a more robust developer community, some see shortcomings in the language. "Double precision is difficult for CUDA. CUDA is very fast using single precision but on double precision it's very slow," says Cui. Double precision uses twice as many bits per floating-point value as single precision, yielding the extra accuracy that much scientific high-performance computing demands.
Mark Gordon, the director of the Applied Mathematics and Computational Sciences program at another DOE facility, the Ames Laboratory, is developing new codes that would enable his Nvidia-supplied processors to use single-precision floating-point format where possible. "There are times when double precision is still necessary," he says.
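Gordon's quantum-chemistry codes aren't shown here, but the mixed-precision idea he describes is easy to sketch: do the bulk of the arithmetic in fast single precision and reserve double precision for the numerically sensitive accumulation. The CUDA kernel below, with illustrative names and a fixed 256-thread block, computes a dot product that way.

#include <cuda_runtime.h>

// Mixed precision: each product is computed in single precision (fast on
// 2009-era GPUs), but the running sums are kept in double precision to
// limit round-off error. Launch with 256 threads per block.
__global__ void dotMixed(const float *a, const float *b, int n,
                         double *blockSums)
{
    __shared__ double partial[256];
    int tid = threadIdx.x;

    // Grid-stride loop: single-precision multiply, double-precision accumulate.
    double acc = 0.0;
    for (int i = blockIdx.x * blockDim.x + tid; i < n;
         i += gridDim.x * blockDim.x)
        acc += (double)(a[i] * b[i]);

    partial[tid] = acc;
    __syncthreads();

    // Standard shared-memory reduction within the block.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride) partial[tid] += partial[tid + stride];
        __syncthreads();
    }

    if (tid == 0) blockSums[blockIdx.x] = partial[0];
}

// The host adds the per-block sums (one double per block) to get the final value.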
Much of Gordon's work in quantum chemistry involves a huge number of variables. The enormous volume of data necessitates that his algorithms employ AI functionality to write code on the fly. Gordon, who uses CUDA, will also write GPU-enabling code directly in C.
Meanwhile, AMD has recently distanced itself from Brook+ in favor of OpenCL (Open Computing Language) and Microsoft's Direct3D 11 Compute Shader. "OpenCL is a programming framework that allows programmers to write once for any number of multicore-powered devices," Cui explains.
Nalasco says the OpenCL standard meshes well with his company's belief that AI applications will utilize both the CPU and the GPU. He says efforts are underway to also write code in OpenCL for the CPU. Meanwhile, Nvidia, while still firmly backing CUDA, released its OpenCL SDK for developers this April.
GPU Computing in the Driver's Seat
Now that developers have a common environment to work in, the pace of GPU-enabled consumer applications might accelerate dramatically in the months ahead. Tamasi, for example, foresees many applications being developed for cars.
"All the major automakers are looking into this," he says.
At an Nvidia-sponsored conference on GPU computing held last year, a paper presented by BMW described how in-dash supercomputers could allow drivers to speak to their cars, while the computer interface reads the drivers' facial expressions and gauges their level of alertness. Programs could also examine vast amounts of sensor data, deduce that a crash is imminent, and then take the appropriate preventive steps. Ultimately, such supercomputers might even take over driving entirely. Tamasi won't disclose just how quickly we'll see futuristic applications such as these.
"Let's just say objects in the mirror are closer than they appear," he says.
Slithering Algorithms
Mark Ingebretsen

Robots come in all shapes and sizes. But few look as original as the menagerie of snake-like devices called HyDRAS (see Figure 2) designed by Dennis Hong, an associate professor of mechanical engineering at Virginia Tech, and his student colleagues. The name HyDRAS stands for Hyper-Redundant Discrete Robotic Articulated Serpentine for climbing. Depending on the model, each of the roughly three-foot-long serpent bots is powered either by electric motors or compressed air.

Figure 2. Hyper-Redundant Discrete Robotic Articulated Serpentine (HyDRAS) for climbing. Dennis Hong's HyDRAS employ AI algorithms to optimize their climbing motions. The robots are designed to inspect areas deemed dangerous to humans.
The HyDRAS's ability to slither up poles of varying widths and materials makes them perfect for inspecting bridge trusses, cell phone towers, and other places that are difficult and dangerous for humans to access. A tether hooked to a standard laptop will give construction crews a snake's-eye view of what's on top.
Intelligent Systems caught up with Hong in June, just before his trip to Austria, where he was leading Team VT_DARwIn, the US entry in the worldwide RoboCup (www.robocup.org) autonomous robotic soccer match. In the interview, he revealed that despite the HyDRAS's resemblance to many creepy-crawly earth creatures, the bots' climbing motions don't exist in nature. The devices wrap themselves around a structure and then twist their entire bodies to produce a rolling motion that permits them to climb up or down a pole. That motion is a departure from the body-contorting inchworm gait other researchers have used as a model for their serpentine bots, Hong noted. In fact, the artificial gait results from AI algorithms he and his students devised.
As Hong explained it, the current generation of algorithms is "based on two disciplines, kinematics for motion generation and compliance for force control." The algorithms allow the bots to adapt to limited changes in the shape and size of the structures they climb. But Hong and his team are already at work on algorithms that will utilize "both a reactive behavior using local touch sensors in the skin and a deliberate behavior for carefully planning each motion as part of a larger plan" the device would use to maneuver. "In order to provide the necessary data, future HyDRAS will come equipped with the necessary sensors to implement this," he said.
And while the HyDRAS's current algorithms can't learn on their own, they can record the motion and infer the shape and dimension of the structure they climbed for future reference. This allows a human programmer to manually optimize the software to adapt to each significantly different type of structure the bots slither up.
Hong said lots more research and development is needed before his HyDRAS can be commercialized at a viable price. And he has no plans to launch a serpent bot company at present.
"Our goal is to do R&D for creating new useful robots to benefit the society," he said.
Indeed, assuming they were widely deployed, Hong's slithering HyDRAS could end up saving thousands of lives. A 2006 report from the US Bureau of Labor Statistics found that 809 of the 1,226 construction workers killed on the job that year died as a result of falls.