2001 IEEE International Conference on Multimedia and Expo (ICME'01)
SPEECH DETECTION BY FACIAL IMAGE FOR MULTIMODAL SPEECH RECOGNITION
Tokyo, Japan
August 22-August 25
ISBN: 0-7695-1198-8
K. Murai, ATR Spoken Language Translation Research Laboratories
K. Kumatani, ATR Spoken Language Translation Research Laboratories
S. Nakamura, ATR Spoken Language Translation Research Laboratories
In this paper, we propose a method that detects speech from facial images for multimodal speech recognition. It is widely acknowledged that the accuracy of speech detection contributes to overall speech recognition performance. Audio-based speech detection performs well under clean conditions, but its performance degrades in the presence of audio noise. We have therefore conducted research on video-based speech detection, which is robust not only to audio noise but also to the speaker's motion and other disturbances in the video modality [1]. However, its detection accuracy suffers because the speech motion intrinsically lasts longer than the speech itself. The proposed method therefore first detects the section that includes the speech by means of robust video-based speech detection, followed by audio-based speech detection to improve accuracy. The method locates the face area by skin color and estimates the region that includes the speech organs. Speech is then detected from the magnitude of the change in the image, without explicitly detecting any organs. An experiment confirms that the proposed method improves the speech recognition rate under a noisy environment (SNR 10 dB) as well as with the audio track of a VCR (SNR 25.4 dB).
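The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so everything here is an assumption, including the function names, the use of a mean absolute frame difference as the "magnitude of the change in the image", the fixed thresholds, the per-frame audio energy aligned to the video frames, and the choice to search for audio activity only inside the video-detected section. The skin-color face localization is not shown; the region containing the speech organs is assumed to be given.

```python
from typing import List, Optional, Sequence, Tuple

Frame = List[List[int]]             # grayscale frame: rows of pixel intensities
Region = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def motion_magnitude(prev: Frame, cur: Frame, region: Region) -> float:
    """Mean absolute inter-frame difference inside the region (assumed measure)."""
    top, left, bottom, right = region
    total = 0
    for r in range(top, bottom):
        for c in range(left, right):
            total += abs(cur[r][c] - prev[r][c])
    return total / ((bottom - top) * (right - left))


def active_span(values: Sequence[float], threshold: float) -> Optional[Tuple[int, int]]:
    """Return (first, last) indices whose value exceeds the threshold, or None."""
    idx = [i for i, v in enumerate(values) if v > threshold]
    return (idx[0], idx[-1]) if idx else None


def detect_speech(
    frames: Sequence[Frame],
    region: Region,
    audio_energy: Sequence[float],
    video_threshold: float,
    audio_threshold: float,
) -> Optional[Tuple[int, int]]:
    """Hypothetical two-stage detector returning (start, end) frame indices."""
    # Stage 1: video. Find the section where the mouth region is changing.
    motion = [0.0] + [
        motion_magnitude(frames[i - 1], frames[i], region)
        for i in range(1, len(frames))
    ]
    span = active_span(motion, video_threshold)
    if span is None:
        return None
    start, end = span

    # Stage 2: audio. Refine the boundaries inside the video section only,
    # so audio noise outside that section cannot trigger a detection.
    inner = active_span(audio_energy[start:end + 1], audio_threshold)
    if inner is None:
        return None
    return (start + inner[0], start + inner[1])
```

In this sketch the video stage yields a section that is deliberately generous, since lip motion tends to start before and end after the audible speech, and the audio stage then tightens the boundaries. A real system would likely smooth both signals and adapt the thresholds to the noise level.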
Citation:
K. Murai, K. Kumatani, S. Nakamura, "SPEECH DETECTION BY FACIAL IMAGE FOR MULTIMODAL SPEECH RECOGNITION," 2001 IEEE International Conference on Multimedia and Expo (ICME'01), p. 275, 2001