Issue No. 01 - Jan. (2014 vol. 36)
Ziheng Zhou, Dept. of Comput. Sci. & Eng., Univ. of Oulu, Oulu, Finland
Xiaopeng Hong, Dept. of Comput. Sci. & Eng., Univ. of Oulu, Oulu, Finland
Guoying Zhao, Dept. of Comput. Sci. & Eng., Univ. of Oulu, Oulu, Finland
Matti Pietikainen, Dept. of Comput. Sci. & Eng., Univ. of Oulu, Oulu, Finland
The problem of visual speech recognition involves decoding the video dynamics of a talking mouth in a high-dimensional visual space. In this paper, we propose a generative latent variable model to provide a compact representation of visual speech data. The model uses separate latent variables to represent the inter-speaker variations of visual appearance and the variations caused by uttering. It also incorporates the structural information of the visual data observed within an utterance by modelling that structure as a path graph and placing the priors of the latent variables along the graph's embedded curve.
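The path-graph idea can be illustrated with a minimal sketch. It assumes that the "embedded curve" refers to a spectral (Laplacian-eigenvector) embedding of a path graph whose nodes are the frames of one utterance; the abstract does not spell this out, so this is an illustration and not the authors' implementation. The function names and the frame count are hypothetical. The sketch relies on a standard fact: the Laplacian eigenvectors of a path graph are sampled cosines, so the frames fall on a smooth curve in the embedding space.

```python
import numpy as np


def path_graph_laplacian(n):
    """Laplacian L = D - A of a path graph with n nodes (one node per frame)."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0  # each frame is linked to the next frame
    A[idx + 1, idx] = 1.0  # undirected graph, so the adjacency is symmetric
    return np.diag(A.sum(axis=1)) - A


def path_graph_embedding(n, dim):
    """Eigenvalues and eigenvectors 1..dim of the Laplacian (constant one skipped)."""
    L = path_graph_laplacian(n)
    eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    return eigvals[1:dim + 1], eigvecs[:, 1:dim + 1]


def cosine_curve(n, k):
    """Closed-form k-th eigenvector: cos(pi * k * (i - 1/2) / n), i = 1..n."""
    i = np.arange(1, n + 1)
    v = np.cos(np.pi * k * (i - 0.5) / n)
    return v / np.linalg.norm(v)


if __name__ == "__main__":
    n_frames = 20  # hypothetical utterance length, chosen for illustration
    vals, vecs = path_graph_embedding(n_frames, dim=3)
    for k in range(1, 4):
        # Eigenvectors are defined up to sign, so compare absolute correlation.
        match = abs(vecs[:, k - 1] @ cosine_curve(n_frames, k))
        print(f"k={k}: eigenvalue={vals[k - 1]:.4f}, |correlation|={match:.6f}")
```

Under this assumption, each frame index maps to a point on a curve defined by cosine functions, and a prior could be attached to a latent variable at the corresponding position on that curve. How the paper actually defines and uses those priors is not stated in the abstract.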
Visualization, Hidden Markov models, Image sequences, Mouth, Speech, Speech recognition, Data models
Ziheng Zhou, Xiaopeng Hong, Guoying Zhao and M. Pietikainen, "A Compact Representation of Visual Speech Data Using Latent Variables," in IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 36, no. 1, pp. 1, 2013.