Fourth IEEE International Conference on Multimodal Interfaces (ICMI'02)
Audiovisual Arrays for Untethered Spoken Interfaces
Pittsburgh, Pennsylvania
October 14-October 16
ISBN: 0-7695-1834-6
When faced with a distant speaker at a known location in a noisy environment, a microphone array can provide a significantly improved audio signal for speech recognition. Estimating the location of a speaker in a reverberant environment from audio information alone can be quite difficult, so we use an array of video cameras to aid localization. Stereo processing techniques are used on pairs of cameras, and foreground 3-D points are grouped to estimate the trajectory of people as they move in an environment. These trajectories are used to guide a microphone array beamformer. Initial results using this system for speech recognition demonstrate increased recognition rates compared to non-array processing techniques.
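To illustrate how a speaker trajectory could steer a microphone array, here is a minimal sketch of a delay-and-sum beamformer. The abstract does not say which beamforming method the authors used, so delay-and-sum is an assumption chosen for simplicity. The function names, the speed-of-sound constant, the whole-sample rounding of delays, and the example geometry are all hypothetical and are not taken from the paper.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def steering_delays(mic_positions, source_position, sample_rate, c=SPEED_OF_SOUND):
    """Per-microphone delay in whole samples, relative to the closest microphone."""
    # Time of flight from the (visually estimated) speaker position to each microphone.
    times = [math.dist(mic, source_position) / c for mic in mic_positions]
    earliest = min(times)
    # Rounding to whole samples is a simplification; it ignores fractional delays.
    return [round((t - earliest) * sample_rate) for t in times]


def delay_and_sum(channels, delays):
    """Advance each channel by its delay, then average across microphones."""
    # Only keep samples for which every shifted channel still has data.
    n_out = min(len(ch) - d for ch, d in zip(channels, delays))
    output = []
    for n in range(n_out):
        total = sum(ch[n + d] for ch, d in zip(channels, delays))
        output.append(total / len(channels))
    return output


# Hypothetical usage: three microphones on a line, speaker position from a tracker.
mics = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0)]
speaker = (3.0, 2.0, 1.5)
example_delays = steering_delays(mics, speaker, sample_rate=16000)
```

After alignment, speech arriving from the tracked position adds coherently across channels, while noise and reverberation from other directions tend to average down. As the speaker moves, the delays would be recomputed from each new position estimate.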
Citation:
Kevin Wilson, Vibhav Rangarajan, Neal Checka, Trevor Darrell, "Audiovisual Arrays for Untethered Spoken Interfaces," icmi, pp.389, Fourth IEEE International Conference on Multimodal Interfaces (ICMI'02), 2002