Issue No. 01 - Jan. (2014 vol. 36)
ISSN: 0162-8828
pp: 1
Ziheng Zhou, Dept. of Comput. Sci. & Eng., Univ. of Oulu, Oulu, Finland
Xiaopeng Hong, Dept. of Comput. Sci. & Eng., Univ. of Oulu, Oulu, Finland
Guoying Zhao, Dept. of Comput. Sci. & Eng., Univ. of Oulu, Oulu, Finland
Matti Pietikainen, Dept. of Comput. Sci. & Eng., Univ. of Oulu, Oulu, Finland
ABSTRACT
Visual speech recognition involves decoding the video dynamics of a talking mouth in a high-dimensional visual space. In this paper, we propose a generative latent variable model that provides a compact representation of visual speech data. The model uses separate latent variables to represent the inter-speaker variations in visual appearance and the variations caused by uttering. It also incorporates the structural information of the visual data observed within an utterance by modelling that structure as a path graph and placing the latent variables' priors along the graph's embedded curve.
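The abstract does not say how the path graph is embedded, so the following is only an illustrative sketch of one common choice, not the authors' implementation. It assumes each video frame of an utterance is a node linked to its temporal neighbours, and it uses the low-frequency eigenvectors of the graph Laplacian as coordinates, which places the frames along a smooth curve. The function names `path_graph_laplacian` and `embed_frames` and the sizes used are hypothetical.

```python
import numpy as np


def path_graph_laplacian(n):
    """Return the Laplacian L = D - A of a path graph with n nodes.

    Node i is connected only to nodes i - 1 and i + 1, which mirrors the
    temporal ordering of frames in an utterance (an assumption made here).
    """
    adjacency = np.zeros((n, n))
    idx = np.arange(n - 1)
    adjacency[idx, idx + 1] = 1.0
    adjacency[idx + 1, idx] = 1.0
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency


def embed_frames(n, dim):
    """Embed n frames into `dim` dimensions with Laplacian eigenvectors.

    The constant eigenvector (eigenvalue 0) is skipped; the next `dim`
    eigenvectors give each frame a coordinate on a smooth curve.
    """
    laplacian = path_graph_laplacian(n)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(laplacian)
    return eigvecs[:, 1:dim + 1]


# Hypothetical utterance of 20 frames embedded on a 3-dimensional curve.
Y = embed_frames(20, 3)
print(Y.shape)
```

For a path graph these eigenvectors are sampled cosines, so consecutive frames map to neighbouring points on the curve. A prior centred on such a curve would be one way to encode the temporal structure the abstract refers to.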
INDEX TERMS
Visualization, Hidden Markov models, Image sequences, Mouth, Speech, Speech recognition, Data models, Pattern analysis, Computer vision, Representations, data structures, and transforms
CITATION
Ziheng Zhou, Xiaopeng Hong, Guoying Zhao, Matti Pietikainen, "A Compact Representation of Visual Speech Data Using Latent Variables", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 36, no. 1, pp. 1, Jan. 2014, doi:10.1109/TPAMI.2013.173