Gesturing Going Mainstream
by George Lawton
The computer interface could take another big leap with a variety of lower-cost techniques for tracking hand gestures. Researchers at the Massachusetts Institute of Technology (MIT) have developed software that can track a $2 Lycra glove with a special pattern on it using a cheap webcam. Other efforts are focusing on using more cameras and increased computer horsepower to recognize bare hands.
The successes of the Wii and iPhone demonstrated the importance of moving beyond traditional button, joystick, and mouse interfaces. Companies exploring ways to expand the possibilities include Microsoft, which plans to release the Kinect full-body-tracking interface for the Xbox game system in November. Kinect uses cameras to track large body movements with fidelity, but it lacks the precision to distinguish minute hand gestures, said MIT graduate student Robert Wang.
The idea of gesture-tracking gloves has been around for some time. The VPL DataGlove came to market in 1987, using novel fiber-optic sensors for tracking finger movements. The technology hasn't come down in price appreciably since it was introduced. A pair of gesture-tracking gloves starts at $1,300 and goes up to $40,000 for a higher-end system with force feedback, said Lee Dickholtz, principal at Meta Motion Systems. Another approach that uses a multicamera-based system with special reflector tape worn on the fingers or gloves starts at about $6,000.
The high costs have constrained hand-gesture tracking to high-end applications such as engineering, computer animation, and science. Wang predicts that a lower-cost gesture-tracking interface could be useful for learning sign language, mastering the piano, and manipulating 3D drawings, as well as serving as a general-purpose interface.
More Horsepower
The most obvious approach to improving the gestural interface is to add more cameras and computational horsepower.
For example, a team of researchers at Fraunhofer HHI in Germany has developed the iPoint, which is optimized for tracking the motion of a single finger. The iPoint's cameras and infrared lights are housed in a special tray about the size of a keyboard that connects to a computer via a FireWire port. It can track a fingertip in real time at 50 Hz with millimeter-precise 3D coordinates. It's currently used in a custom medical application that Karl Storz AG developed to reduce hygiene issues associated with physically touching a computer in an operating room.
In the US, Edge 3 Technologies is pursuing a computationally intensive approach, developing algorithms to run on the latest generation of highly parallelized GPUs for PCs. The GPUs can provide 100 times the processing power of a standard CPU.
"These algorithms are challenging to develop and are very different from sequential proof-of-concept type of algorithms," said Tarek El Dokor, CEO of Edge 3 Technologies. "However, once they're mastered, they enable third-party applications in gesture recognition that are truly amazing and at very reasonable prices, since they utilize off-the-shelf cameras."
Working Smarter
The MIT researchers turned the recognition problem on its head. Wang and MIT associate professor Jovan Popović realized they could make the gesture-recognition algorithms more efficient by reducing gestures to 40 × 40 pixel images. They generate the unique patterns from the layout of a glove with specially placed color splotches.
Wang said the image is sufficiently descriptive and compact to capture hand gestures economically. "We can tell what pose the hand is in just from a tiny 40 × 40 pixel image, and the image is small enough that it doesn't take up very much space."
Other finger-modeling approaches are more computationally intensive. They must first perform feature detection to determine the finger locations and other kinds of analysis to compare their relative positions. In contrast, the color patches on the MIT gloves generate patterns that map directly to finger positions.
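The tiny-image idea can be illustrated with a short sketch. The article does not describe how the MIT software matches images to poses, so the nearest-match lookup, the toy frames, and every name below are assumptions made for illustration only. The sketch shrinks a grid of color-patch labels to 40 × 40 and picks the stored pose whose tiny image differs in the fewest pixels.

```python
# Hypothetical sketch: the lookup scheme, the toy frames, and all names are
# illustrative assumptions, not the MIT implementation.

TINY = 40  # side length of the tiny image, as quoted in the article


def downsample(labels, size=TINY):
    """Shrink a 2D grid of color-patch labels to size x size by nearest-pixel sampling."""
    rows, cols = len(labels), len(labels[0])
    return [
        [labels[r * rows // size][c * cols // size] for c in range(size)]
        for r in range(size)
    ]


def distance(a, b):
    """Count the pixels whose color-patch label differs between two tiny images."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def nearest_pose(tiny, database):
    """Return the name of the stored pose whose tiny image is closest to the query."""
    return min(database, key=lambda name: distance(tiny, database[name]))


def make_frame(rows, cols, split):
    """Toy 'camera frame': label 1 left of the split column, label 2 right of it."""
    return [[1 if c < split else 2 for c in range(cols)] for _ in range(rows)]


# A made-up two-entry pose database built from toy frames.
database = {
    "open_hand": downsample(make_frame(120, 160, 40)),
    "fist": downsample(make_frame(120, 160, 120)),
}

# A query frame at a different resolution with a layout close to "open_hand".
query = downsample(make_frame(240, 320, 90))
print(nearest_pose(query, database))  # prints "open_hand"
```

Each tiny image holds only 1,600 labels, which is why a comparison of this kind is cheap next to per-finger feature detection. A real system would need a far larger database and a more robust distance measure than this toy count.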
The color-glove interface can track a hand using a single $50 web camera with a wide-angle lens. First-time calibration takes about 15 seconds, during which the system learns the hand size and lighting. The system detects the 3D orientation and 3D position of the hands as well as the finger configuration. Wang said that it could enhance gaming systems, such as Microsoft Kinect, to support hand gestures.
The interface isn't yet up to the same accuracy as a mouse or even a touch screen, Wang said. But it lets users provide 3D input in a more natural way. The system is currently in private beta, and Wang expects it to be ready for more widespread use in a couple of months.
Don’t Throw Out the Mouse…Yet
There are challenges in moving "out of the gimmick realm and into mainstream adoption," noted El Dokor. "Many will oversell, few will deliver. For this to work, the expectations have to match reality."
More work is needed for the software stack to integrate with applications like Skype and gaming platforms like the Xbox. Furthermore, El Dokor said that developers will have to get comfortable working on high-performance applications that run at 60 frames per second with high resolution.
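To put the frame rates in perspective, the short calculation below converts a rate into the time available to process each frame. It is plain arithmetic on figures quoted in the article (60 frames per second here, 50 Hz for the iPoint); the function name is an illustrative choice.

```python
def frame_budget_ms(rate_hz):
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / rate_hz


for rate in (50, 60):
    print(f"{rate} Hz -> {frame_budget_ms(rate):.1f} ms per frame")
# 50 Hz -> 20.0 ms per frame
# 60 Hz -> 16.7 ms per frame
```

At 60 frames per second, capture, recognition, and the application's own response all have to fit within roughly 17 milliseconds.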
User acceptance also poses a challenge, noted Paul Chojecki, a research scientist at Fraunhofer HHI. "The lack of haptic technology really hurts. It’s hard to learn how to use a gesture-control system with bare hands in the air."
Despite the challenges, many proponents expect gesture-driven computing to thrive in the long run. "Gaming with Microsoft Kinect is just the beginning," said Chojecki. "But it will bring the idea of a contact-free interaction to living rooms all over the world." He sees the advantages in vandal-proofing, hygiene, and size driving the replacement of touch screens in many areas.
For more information on the MIT project, see http://people.csail.mit.edu/rywang/hand.
George Lawton is a freelance correspondent in Guerneville, California. You can contact him via his website, http://www.glawton.com.