2006 IEEE International Conference on Multimedia and Expo
Acoustically-Driven Talking Face Synthesis using Dynamic Bayesian Networks
Toronto, ON, Canada
July 09-July 12
ISBN: 1-4244-0366-7
Jianxia Xue, University of California, Los Angeles, CA 90095, USA. jxue@ee.ucla.edu
Jonas Borgstrom, University of California, Los Angeles, CA 90095, USA. jonas@ee.ucla.edu
Jintao Jiang, House Ear Institute, Los Angeles, CA 90057, USA. jjiang@hei.org
Lynne Bernstein, House Ear Institute, Los Angeles, CA 90057, USA. lbernstein@hei.org
Abeer Alwan, University of California, Los Angeles, CA 90095, USA. alwan@ee.ucla.edu
Dynamic Bayesian Networks (DBNs) have been widely studied in multi-modal speech recognition applications. Here, we introduce DBNs into an acoustically-driven talking face synthesis system. Three prototypes of DBNs, namely independent, coupled, and product HMMs, were studied. Results showed that the DBN methods were more effective in this study than a multilinear regression baseline. Coupled and product HMMs performed similarly to each other, and both outperformed independent HMMs in terms of motion trajectory accuracy. Audio and visual speech asynchronies were represented differently for coupled HMMs versus product HMMs.
Citation:
Jianxia Xue, Jonas Borgstrom, Jintao Jiang, Lynne Bernstein, Abeer Alwan, "Acoustically-Driven Talking Face Synthesis using Dynamic Bayesian Networks," ICME, pp. 1165-1168, 2006 IEEE International Conference on Multimedia and Expo, 2006