IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 36, no. 1, Jan. 2014
Ziheng Zhou, University of Oulu, Oulu
Xiaopeng Hong, University of Oulu, Oulu
Guoying Zhao, University of Oulu, Oulu
Matti Pietikainen, University of Oulu, Oulu
Visual speech recognition involves decoding the video dynamics of a talking mouth in a high-dimensional visual space. In this paper, we propose a generative latent variable model that provides a compact representation of visual speech data. The model uses separate latent variables to represent inter-speaker variations in visual appearance and the variations caused by uttering. It also incorporates the structural information of the visual data observed within an utterance by modelling that structure with a path graph and placing the latent variables' priors along the graph's embedded curve.
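As a rough illustration of the path-graph idea only, the sketch below builds the Laplacian of a path graph with one node per video frame and evaluates its eigenvectors in closed form, which trace a smooth curve over the frames. It is not the authors' model: reading the "embedded curve" as the Laplacian eigenvectors of the path graph is an assumption, and the names `path_graph_laplacian`, `embedded_curve`, `n_frames`, and `dim` are hypothetical.

```python
import numpy as np

def path_graph_laplacian(n):
    """Laplacian L = D - A of a path graph with n nodes (one node per frame)."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0   # edge from frame t to frame t + 1
    A[idx + 1, idx] = 1.0   # symmetric counterpart
    return np.diag(A.sum(axis=1)) - A

def embedded_curve(n, dim):
    """Closed-form Laplacian eigenvectors of the path graph for k = 1..dim.

    Row t holds the coordinates of frame t on the curve (assumed reading
    of the abstract's "embedded curve"; the constant k = 0 vector is skipped).
    """
    t = np.arange(1, n + 1)[:, None]
    k = np.arange(1, dim + 1)[None, :]
    return np.cos(np.pi * k * (2 * t - 1) / (2 * n))

n_frames, dim = 20, 3  # hypothetical utterance length and curve dimension
L = path_graph_laplacian(n_frames)
Y = embedded_curve(n_frames, dim)

# Eigenvalues that pair with the cosine eigenvectors above.
eigvals = 2.0 - 2.0 * np.cos(np.pi * np.arange(1, dim + 1) / n_frames)

# Check numerically that L @ Y equals Y scaled column-wise by eigvals.
residual = np.abs(L @ Y - Y * eigvals).max()
print(Y.shape, residual)
```

Each row of `Y` is a point on a continuous curve parameterised by frame position, so utterances of different lengths map onto the same curve. In a model of the kind described, priors for per-frame latent variables could be centred at such points, though the exact form of those priors is not given here.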
Visualization, Hidden Markov models, Image sequences, Mouth, Speech, Speech recognition, Data models, Pattern analysis, Computer vision, Representations, data structures, and transforms
Ziheng Zhou, Xiaopeng Hong, Guoying Zhao, Matti Pietikainen, "A Compact Representation of Visual Speech Data Using Latent Variables", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 36, no. 1, pp. 1, Jan. 2014, doi:10.1109/TPAMI.2013.173