Proceedings of 1994 28th Asilomar Conference on Signals, Systems and Computers (1994)
Pacific Grove, CA, USA
Oct. 31, 1994 to Nov. 2, 1994
ISSN: 1058-6393
ISBN: 0-8186-6405-3
pp: 587-590
R.A. Rao, Sch. of Electr. Eng., Georgia Inst. of Technol., Atlanta, GA, USA
R.M. Mersereau, Sch. of Electr. Eng., Georgia Inst. of Technol., Atlanta, GA, USA
ABSTRACT
In this paper, we describe an algorithm for modeling the shape of the mouth and extracting meaningful dimensions for use by automatic lipreading systems. One advantage of this technique is that the model can be normalized to compensate for scale and rotation. An error function is defined that relates the model to the image, and minimizing this error yields the best-fit model. This is similar to deformable templates, but we attempt to perform the minimization in closed form. Visual-only recognition was performed with features extracted from the model, and the recognition system achieved 85% accuracy on a two-word discrimination task.
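The abstract does not specify the form of the lip model or of the error function, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a single parabola v = h(1 - u^2) through the two mouth corners (a shape borrowed from the deformable-template literature), a squared-error function over detected lip-edge points, and known corner locations. The function names `normalize_points` and `fit_parabola_height` are hypothetical. Under these assumptions the rotation and scale normalization is a change of coordinates, and the error is minimized in closed form rather than by iterative search.

```python
import math


def normalize_points(points, left_corner, right_corner):
    """Map image points into a frame where the mouth corners sit at (-1, 0) and (1, 0).

    This removes translation, in-plane rotation, and scale (assumed approach).
    """
    (x1, y1), (x2, y2) = left_corner, right_corner
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_width = math.hypot(x2 - x1, y2 - y1) / 2.0
    if half_width == 0:
        raise ValueError("mouth corners must be distinct")
    angle = math.atan2(y2 - y1, x2 - x1)
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)

    normalized = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        # Rotate by -angle, then divide by the half-width of the mouth.
        u = (dx * cos_a - dy * sin_a) / half_width
        v = (dx * sin_a + dy * cos_a) / half_width
        normalized.append((u, v))
    return normalized


def fit_parabola_height(normalized_points):
    """Closed-form least-squares height h for the model v = h * (1 - u**2).

    Setting d/dh of sum((v_i - h * (1 - u_i**2))**2) to zero gives
    h = sum(v_i * w_i) / sum(w_i**2), with w_i = 1 - u_i**2.
    """
    num = sum(v * (1.0 - u * u) for u, v in normalized_points)
    den = sum((1.0 - u * u) ** 2 for u, v in normalized_points)
    if den == 0:
        raise ValueError("points carry no information about the height")
    return num / den


if __name__ == "__main__":
    # Hypothetical edge points along one lip contour, in pixel coordinates.
    edge_points = [(95.0, 52.0), (100.0, 54.0), (105.0, 53.0)]
    corners = ((90.0, 50.0), (110.0, 51.0))
    h = fit_parabola_height(normalize_points(edge_points, *corners))
    print("normalized lip height:", h)
```

The fitted height `h` is expressed in units of half the mouth width, so it is unaffected by how large or how tilted the mouth appears in the image. Its sign depends on the image coordinate convention (y increasing downward or upward). A feature of this kind, computed per frame, is one plausible input to a visual-only recognizer.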
INDEX TERMS
speech processing, speech recognition, vision, image processing, feature extraction
CITATION

R. Rao and R. Mersereau, "Lip modeling for visual speech recognition," Proceedings of 1994 28th Asilomar Conference on Signals, Systems and Computers (ACSSC), Pacific Grove, CA, USA, 1995, pp. 587-590.
doi:10.1109/ACSSC.1994.471520