This is the second of three Advanced Graphics Technology departments CG&A will publish this year. We're alternating this department with Tools and Products.
Comments and criticisms on this new approach are most welcome. Drop an e-mail to firstname.lastname@example.org.
Each of the first five items combines different technologies—computer vision, artificial intelligence, and mathematics—to perform image recognition.
Determining quality in any manufactured product is truly difficult, partly because the definition of "quality" varies and partly because the samples themselves vary. An entire field, nondestructive testing, is devoted to assessing quality.
Carnegie Mellon University researchers accepted a different challenge: How do you assess a starter plant's quality? The CMU group chose the strawberry because strawberry plants require yearly replacement. They developed a plant-sorting system that automatically assesses multiple grades of plant quality. Machine-learning techniques (likely neural nets) gradually improve the assessment results. The ultimate test for such technology is determining the gender of baby chicks. For more information, visit www.rec.ri.cmu.edu/projects/strawberry.
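To make the learning loop concrete, here's a minimal sketch of an online grader that sharpens its decision boundary as operators correct its mistakes. The perceptron update and the feature names (root mass, crown diameter) are our own illustration; CMU hasn't published the system's internals.

```python
# Hypothetical sketch of a plant grader that improves with feedback.
# Features (e.g., root mass, crown diameter) and the perceptron rule
# are illustrative assumptions, not the CMU system's actual design.

def train_grader(samples, labels, epochs=20, lr=0.1):
    """Perceptron-style learner: the boundary improves as graded samples arrive."""
    w = [0.0] * (len(samples[0]) + 1)          # last slot is the bias
    for _ in range(epochs):
        for x, y in zip(samples, labels):      # y = +1 (ship) or -1 (cull)
            score = sum(wi * xi for wi, xi in zip(w, x)) + w[-1]
            if y * score <= 0:                 # misgraded: nudge the boundary
                w = [wi + lr * y * xi for wi, xi in zip(w, x)] + [w[-1] + lr * y]
    return w

def grade(w, x):
    """Apply the learned boundary to a new plant's feature vector."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + w[-1] > 0 else -1
```

Each operator correction is one more labeled sample, so the grader's accuracy grows with use, which matches the article's "gradually improve" description.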
Adrian Evans and Adrian Moorhouse at the University of Bath are investigating reliable identification of a person from just a nose. Machine vision techniques reconstruct the nose from a series of four images from different camera angles. The nose's key characteristics (the saddle width, ridge length, and nose tip width) are computed and then matched to a database containing people's nose prints. The algorithms for identifying a person's nose characteristics and the match itself are faster than facial recognition and potentially useful in quickly scanning crowds. Who knows, perhaps plastic surgeons might see a significant upswing in business as a result of this new technology. For more information, visit www.bath.ac.uk/news/2010/03/02/nose-recognition.
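The matching step might look like the following sketch, which finds the nearest stored nose print by Euclidean distance over the three features the article names. The metric, the units, and the database layout are our assumptions, not the Bath group's published method.

```python
import math

# Illustrative sketch only: match a measured nose-feature vector against
# a database by nearest neighbor. The three features follow the article
# (saddle width, ridge length, tip width); the Euclidean metric and
# millimetre units are assumptions.

def match_nose(query, database):
    """Return the name whose stored feature tuple is closest to `query`.

    query: (saddle_width, ridge_length, tip_width)
    database: dict mapping name -> feature tuple
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(database, key=lambda name: dist(database[name], query))
```

With only three numbers per person, each comparison is a handful of arithmetic operations, which is why such matching can be much faster than full facial recognition.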
Figure: New software analyzes shadows and works out coordinates for each point on the face.
The previous two pieces of research work well because they focus on specific types of objects: strawberry plants and noses. A problem with image recognition since its inception is the reliable identification of arbitrary objects in photographs. Photo interpreters in the intelligence community, airport security screeners, and many others could benefit from improved object recognition techniques that don't require extensive training.
MIT researchers have developed an object recognition system that requires no training yet identifies objects at least as well as the best prior algorithms. The idea arose from frame-to-frame coherence, a method for improving video compression; here, the system compares still images to one another for similarity, and the more similar the images, the better the results. A huge database of labeled images, continually updated through labelme.csail.mit.edu, provides basic labels for many images. For more information, visit http://groups.csail.mit.edu/vision/TinyImages.
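In spirit, the approach reduces to label transfer by similarity: downsample every labeled image to a tiny fixed size, then give a query image the label of its most similar neighbor. This toy sketch (with invented four-pixel "images") illustrates the idea only, not MIT's actual implementation.

```python
# Toy sketch of label transfer by image similarity. The 2x2 pixel lists
# and labels are invented placeholders; a real system would compare
# tiny downsampled versions of photographs.

def ssd(a, b):
    """Sum of squared pixel differences: smaller means more similar."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def label_by_similarity(query, labeled_images):
    """labeled_images: list of (pixels, label); return the closest image's label."""
    _, best_label = min(labeled_images, key=lambda item: ssd(item[0], query))
    return best_label
```

No training pass is needed; all of the "knowledge" lives in the labeled database, which is why a continually growing collection like LabelMe matters so much here.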
Reconstructing full 3D models from one or more images has been a key computer graphics problem since Larry Roberts' 1963 PhD thesis ("Machine Perception of Three-Dimensional Solids"). The problem still isn't solved using a single image. Extraordinary advances have been made using image-based modeling, scanning and digitizing (lidar and coordinate-measuring machines), and 3D camera-based approaches. Cost has always been a factor.
The University of Cambridge has created a program to build 3D models of textured objects in real time. The budget-conscious should be pleased because the software needs only commodity PCs and a webcam. In this approach, a person rotates a physical object in front of the webcam. The software dynamically approximates the Delaunay tetrahedral mesh containing points identified from the webcam video feed. Quick postprocessing removes invalid tetrahedra (using a probabilistic carving algorithm) to obtain the surface mesh. The postprocessing concludes by applying the webcam textures. The scan does a remarkable job of filtering jitter from the human hand rotating the object. For more information, visit http://mi.eng.cam.ac.uk/~qp202/my_papers/BMVC09.
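The surface-extraction step can be sketched simply: drop the tetrahedra that the carving test rejects, then keep the triangular faces that belong to exactly one surviving tetrahedron (interior faces are shared by two). The keep probabilities below are placeholders standing in for the probabilistic carving test, so this is a conceptual sketch rather than the Cambridge algorithm itself.

```python
from collections import Counter
from itertools import combinations

# Conceptual sketch of surface extraction from a carved tetrahedral mesh.
# Tetrahedra are 4-tuples of vertex indices; keep_prob stands in for the
# probabilistic carving test described in the article.

def surface_mesh(tetrahedra, keep_prob, threshold=0.5):
    """Drop carved-away tetrahedra, then return the boundary triangles."""
    kept = [t for t, p in zip(tetrahedra, keep_prob) if p >= threshold]
    face_counts = Counter(
        tuple(sorted(face)) for t in kept for face in combinations(t, 3)
    )
    # Interior faces are shared by two kept tetrahedra; boundary faces by one.
    return [face for face, n in face_counts.items() if n == 1]
```

For two tetrahedra sharing a face, the shared face is counted twice and therefore excluded, leaving only the six outer triangles as the surface.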
Students at the University of Missouri at Columbia and the Missouri University of Science and Technology have developed a robotic platform incorporating an infrared camera and a lidar scanner. The camera lets operators remotely pilot the robot in a manner similar to unmanned aerial vehicles. The research team added lidar to do a precise 3D scan of the robot's environment. Because of the scan's accuracy, operators can use the robot in unsafe or unpredictable situations (such as buildings damaged by an earthquake or military operations). The robot's basic components cost US$25,000. Smaller, lighter, and invisible (thanks to Harry Potter) versions are next. For more information, visit www.engadget.com/2010/02/23/lidar-equipped-robot-maps-dangerous-areas-in-3d-so-you-dont-hav.
Figure: This robotic platform can do a precise 3D scan of an environment.
The next two items discuss methods people use to provide input to complex graphics systems.
Touch screen aficionados using small-screen devices have five choices per hand to provide input to their favorite application. Three-axis accelerometers capture roll and pitch (yaw, a rotation about gravity, requires a gyroscope or magnetometer), and GPS captures location.
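The tilt part of that sensing is standard: with the device at rest, the measured gravity vector fixes roll and pitch. This sketch assumes accelerations in g units and a device lying flat as the zero position; it also shows why yaw isn't recoverable from the accelerometer alone.

```python
import math

# Standard sketch of recovering device tilt from a three-axis
# accelerometer at rest. Inputs are accelerations in g units; yaw
# (rotation about the gravity axis) leaves these readings unchanged,
# so it cannot be measured this way.

def tilt_from_accel(ax, ay, az):
    """Return (roll, pitch) in degrees from the sensed gravity vector."""
    roll = math.degrees(math.atan2(ay, az))
    pitch = math.degrees(math.atan2(-ax, math.hypot(ay, az)))
    return roll, pitch
```

A device lying flat reads (0, 0, 1) and yields zero roll and pitch; tipping it onto an edge swings the corresponding angle toward 90 degrees.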
Carleton University researchers are investigating touch screens for large-screen (wall- or table-sized) devices. The key is to simultaneously accept touch input from multiple users. Earlier research (Hi-Space and ShareSpace from the University of Washington) used cameras to capture similar data. Multitouch technology can let multiple people sit at conference table screens and work in parallel. For more information, visit www2.carleton.ca/newsroom/news-articles/researchers-point-way-to-new-touch-screen-technology.
Computers have been equipped to accept numerous forms of human input. The vast majority in production use are still tactile. Some systems use sounds, such as voice, with varying degrees of success, especially for automated phone answering and dictation systems. A few, such as eye tracking, use cameras. Another form of input, the brain wave, has been used experimentally to produce art (for example, the Brainwave Chick; http://brainwavechick.com/pages/bwrevisited.html) and early game controllers (such as Biocontrol Systems BioWave and BioFlex). Brain waves have become attractive to large companies such as Intel as a method to control not only normal devices but also prosthetic devices.
Additional research at the University of Washington is mapping the regions of the brain that are stimulated during activities such as cursor control. This research has shown that users, after only 10 minutes of training, create significantly stronger brain signals when they imagine controlling a cursor than they do when physically using a mouse. Perhaps the Vulcan mind meld is closer to reality than we thought. For more information, visit www.physorg.com/news185470589.html.