Speech/Gesture Interface to a Visual-Computing Environment
March/April 2000 (vol. 20 no. 2)
pp. 29-37
Recent progress in 3D immersive display and virtual reality (VR) technologies has made many exciting applications possible. Fully exploiting this potential requires "natural" interfaces that let users manipulate such displays without cumbersome attachments. In this article we describe the use of visual hand-gesture analysis and speech recognition to build a speech/gesture interface for controlling a 3D display. The interface enhances an existing application, VMD, a VR visual-computing environment for structural biology. Free-hand gestures, together with a set of speech commands, manipulate the 3D graphical display. We found
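The abstract describes combining recognized speech commands with tracked free-hand gestures to drive a 3D display. A minimal sketch of one common way to pair the two modalities (temporal co-occurrence within a small window) is shown below; the event types, field names, and the 0.5 s window are illustrative assumptions, not details from the paper.

```python
from dataclasses import dataclass

# Hypothetical event types for the two input modalities
# (names and fields are illustrative, not from the paper).
@dataclass
class SpeechEvent:
    keyword: str       # e.g. "rotate", "stop" (from keyword spotting)
    timestamp: float   # seconds

@dataclass
class GestureEvent:
    pose: str          # e.g. "point", "grab" (from the vision tracker)
    position: tuple    # 3D hand position
    timestamp: float

def fuse(speech: SpeechEvent, gesture: GestureEvent, window: float = 0.5):
    """Pair a speech command with a gesture observed close in time.

    Returns a display-command dict, or None when the two events are
    too far apart to plausibly belong to the same utterance.
    """
    if abs(speech.timestamp - gesture.timestamp) > window:
        return None
    return {"action": speech.keyword,
            "pose": gesture.pose,
            "target": gesture.position}

# Example: "rotate" spoken 0.18 s before a pointing gesture is observed.
cmd = fuse(SpeechEvent("rotate", 1.02),
           GestureEvent("point", (0.1, 0.4, 0.7), 1.20))
```

Here the speech channel supplies the verb and the gesture channel supplies the spatial referent, in the spirit of "put-that-there"-style interfaces cited below.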

1. M. Nelson et al., "MDScope—A Visual Computing Environment for Structural Biology," Computer Physics Communications, vol. 91, nos. 1-3, Jan. 1995, pp. 111-134.
2. V.I. Pavlovic, R. Sharma, and T.S. Huang, "Visual Interpretation of Hand Gestures for Human-Computer Interaction: A Review," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, July 1997, pp. 677-695.
3. J.G. Wilpon, L.R. Rabiner, C.-H. Lee, and E.R. Goldman, "Automatic Recognition of Keywords in Unconstrained Speech Using Hidden Markov Models," IEEE Trans. Acoustics, Speech, and Signal Processing, vol. 38, Nov. 1990, pp. 1870-1878.
4. J.M. Rehg and T. Kanade, "DigitEyes: Vision-Based Human Hand Tracking," Tech. Report CMU-CS-93-220, School of Computer Science, Carnegie Mellon Univ., Pittsburgh, Pa., 1993.
1. A.G. Hauptmann and P. McAvinney, "Gesture with Speech for Graphics Manipulation," Int'l J. Man-Machine Studies, vol. 38, no. 2, Feb. 1993, pp. 231-249.
2. R. Bolt, "Put-That-There: Voice and Gesture at the Graphics Interface," Computer Graphics, vol. 14, no. 3, 1980, pp. 262-270.
3. P.R. Cohen et al., "QuickSet: Multimodal Interaction for Distributed Applications," Proc. 5th ACM Int'l Multimedia Conf., ACM Press, 1997, pp. 31-40.
4. J. Wang, "Integration of Eye-Gaze, Voice and Manual Response in Multimodal User Interface," Proc. IEEE Int'l Conf. Systems, Man, and Cybernetics, IEEE Press, Piscataway, N.J., 1995, pp. 3938-3942.
5. P. Maes et al., "ALIVE: Artificial Life Interactive Video Environment," Intercommunication, vol. 7, Winter 1999, pp. 48-49.
6. A. Pentland, "Smart Rooms," Scientific American, Apr. 1996, pp. 54-62.
7. V.I. Pavlovic, R. Sharma, and T.S. Huang, "Visual Interpretation of Hand Gestures for Human-Computer Interaction: A Review," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, July 1997, pp. 677-695.

Citation:
Rajeev Sharma, Michael Zeller, Vladimir I. Pavlovic, Thomas S. Huang, Zion Lo, Stephen Chu, Yunxin Zhao, James C. Phillips, Klaus Schulten, "Speech/Gesture Interface to a Visual-Computing Environment," IEEE Computer Graphics and Applications, vol. 20, no. 2, pp. 29-37, March-April 2000, doi:10.1109/38.824531