Second IEEE International Conference on Automatic Face and Gesture Recognition (FG '96)
LISTEN: A System for Locating and Tracking Individual Speakers
Killington, Vermont
October 14-16, 1996
ISBN: 0-8186-7713-9
Both visual and acoustic information provide effective means of telecommunication between people. In this context, the face is the most important part of the person, both visually and acoustically. We describe how combining image and audio processing makes it possible to track a person's face and to collect the audio information it produces. We present techniques for detecting regions of interest (e.g. moving regions of skin color), coupled with a neural-network-based face detector with a low false alarm rate, to locate and track faces. The system is connected to a nine-microphone array with adaptive beamforming, which performs the beamforming immediately. Visual and acoustic information from the speaker's face is thus obtained in real time.
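The link between the face tracker and the microphone array can be illustrated with a minimal sketch. It is not the adaptive beamformer of the system described above: it uses plain delay-and-sum steering as a simpler stand-in, and the linear array geometry, the 0.1 m microphone spacing, the face position, and the function name `steering_delays` are all hypothetical choices made for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature


def steering_delays(mic_positions, source_position):
    """Return one delay (in seconds) per microphone for delay-and-sum steering.

    Sound reaches the closest microphones first, so they get the largest
    delay. Delays are relative to the farthest microphone, so all are >= 0.
    """
    distances = [math.dist(mic, source_position) for mic in mic_positions]
    farthest = max(distances)
    return [(farthest - d) / SPEED_OF_SOUND for d in distances]


# Assumed geometry: nine microphones on a line, 0.1 m apart, centered at x = 0.
mics = [((i - 4) * 0.1, 0.0) for i in range(9)]

# Assumed face location reported by the visual tracker: 2 m in front of the array.
face = (0.0, 2.0)

delays = steering_delays(mics, face)
```

Summing the nine microphone signals after applying these delays reinforces sound arriving from the tracked face position relative to sound from other directions. An adaptive beamformer would additionally adjust its weights to the acoustic environment, which this sketch does not attempt.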
Index Terms:
Face Detection, Face Tracking, Computer vision, Neural networks, Microphone array.
Citation:
M. Collobert, R. Feraud, G. Le Tourneur, O. Bernier, J. E. Viallet, Y. Mahieux, D. Collobert, "LISTEN: A System for Locating and Tracking Individual Speakers," fg, pp.283, Second IEEE International Conference on Automatic Face and Gesture Recognition (FG '96), 1996