2001 IEEE International Conference on Multimedia and Expo (ICME'01)
A COMPARISON OF MODEL AND TRANSFORM-BASED VISUAL FEATURES FOR AUDIO-VISUAL LVCSR
Tokyo, Japan
August 22-August 25
ISBN: 0-7695-1198-8
Iain Matthews, Robotics Institute Carnegie Mellon University
Gerasimos Potamianos, IBM T. J. Watson Research Center
Chalapathy Neti, IBM T. J. Watson Research Center
Juergen Luettin, Ascom Systec AG

Four different visual speech parameterisation methods are compared on a large-vocabulary, continuous, audio-visual speech recognition task using the IBM ViaVoice™ audio-visual speech database. Three are transforms applied directly to the mouth image region: the discrete cosine transform, the wavelet transform, and principal component analysis. The fourth uses a statistical model of shape and appearance, called an active appearance model, to track the face and obtain model parameters describing the entire face.
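
As a rough illustration of what a transform-based parameterisation of the mouth region looks like, the following sketch computes two of the feature types named above, a 2-D discrete cosine transform and a principal component projection, for a grayscale mouth image. It is a minimal sketch and not the paper's implementation: the function names, the 8x8 region size, the choice to keep a low-frequency block of DCT coefficients, and the number of principal components are all assumptions made for the example.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix (rows are frequencies)."""
    k = np.arange(n)[:, None]   # frequency index
    i = np.arange(n)[None, :]   # sample index
    basis = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    basis[0, :] *= np.sqrt(1.0 / n)
    basis[1:, :] *= np.sqrt(2.0 / n)
    return basis

def dct_features(roi, block=5):
    """2-D DCT of a grayscale mouth region.

    Keeps the top-left `block` x `block` low-frequency coefficients.
    This selection rule is an assumption made for illustration.
    """
    roi = np.asarray(roi, dtype=float)
    rows, cols = roi.shape
    coeffs = dct_matrix(rows) @ roi @ dct_matrix(cols).T
    return coeffs[:block, :block].ravel()

def pca_fit(frames, n_components):
    """Estimate the mean and leading principal axes from training regions."""
    data = np.asarray(frames, dtype=float).reshape(len(frames), -1)
    mean = data.mean(axis=0)
    # Rows of vt are principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_features(roi, mean, components):
    """Project one mouth region onto the learned principal axes."""
    return components @ (np.asarray(roi, dtype=float).ravel() - mean)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train = rng.random((10, 8, 8))           # synthetic stand-in for mouth regions
    mean, axes = pca_fit(train, n_components=3)
    print(dct_features(train[0]).shape)      # one DCT feature vector per frame
    print(pca_features(train[0], mean, axes).shape)
```

In both cases each video frame yields a fixed-length feature vector, which is the form an HMM-based recogniser consumes. The wavelet and active appearance model features are not sketched here.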

All parameterisations are compared experimentally using hidden Markov models (HMMs) in a speaker-independent test. Visual-only HMMs are used to rescore lattices obtained from audio models trained in noisy conditions.

Citation:
Iain Matthews, Gerasimos Potamianos, Chalapathy Neti, Juergen Luettin, "A COMPARISON OF MODEL AND TRANSFORM-BASED VISUAL FEATURES FOR AUDIO-VISUAL LVCSR," icme, pp.210, 2001 IEEE International Conference on Multimedia and Expo (ICME'01), 2001