A Statistical Quality Model for Data-Driven Speech Animation
Nov. 2012 (vol. 18 no. 11)
pp. 1915-1927
Xiaohan Ma, Dept. of Comput. Sci., Univ. of Houston, Houston, TX, USA
Zhigang Deng, Dept. of Comput. Sci., Univ. of Houston, Houston, TX, USA
In recent years, data-driven speech animation approaches have achieved significant successes in terms of animation quality. However, how to automatically evaluate the realism of novel synthesized speech animations has been an important yet unsolved research problem. In this paper, we propose a novel statistical model (called SAQP) to automatically predict the quality of speech animations synthesized on the fly by various data-driven techniques. Its essential idea is to construct a phoneme-based Speech Animation Trajectory Fitting (SATF) metric to describe speech animation synthesis errors, and then to build a statistical regression model that learns the association between the obtained SATF metric and the objective speech animation synthesis quality. Through carefully designed user studies, we evaluate the effectiveness and robustness of the proposed SAQP model. To the best of our knowledge, this work is the first quantitative quality model of its kind for data-driven speech animation. We believe it is an important first step toward removing a critical technical barrier to applying data-driven speech animation techniques in numerous online or interactive talking avatar applications.
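The two-stage idea in the abstract (a per-phoneme trajectory error, then a regression from that error to a quality score) can be illustrated with a minimal sketch. The abstract does not give the SATF formula or the type of regression, so everything here is an assumption made for illustration: root-mean-square error stands in for the fitting metric, a one-variable ordinary least squares fit stands in for the regression model, and all function names and numbers are hypothetical toy values, not results from the paper.

```python
import math


def segment_rms_error(synth, ref):
    """RMS distance between two trajectories (lists of per-frame vectors).

    Assumption: RMS error is a stand-in, not the paper's SATF metric.
    """
    squared = [
        (s - r) ** 2
        for s_frame, r_frame in zip(synth, ref)
        for s, r in zip(s_frame, r_frame)
    ]
    return math.sqrt(sum(squared) / len(squared))


def clip_error_feature(synth_segments, ref_segments):
    """Average the per-phoneme errors of one clip into a single feature."""
    errors = [
        segment_rms_error(s, r) for s, r in zip(synth_segments, ref_segments)
    ]
    return sum(errors) / len(errors)


def fit_quality_model(features, scores):
    """Ordinary least squares fit of score = slope * feature + intercept."""
    n = len(features)
    mean_x = sum(features) / n
    mean_y = sum(scores) / n
    var_x = sum((x - mean_x) ** 2 for x in features)
    cov_xy = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(features, scores)
    )
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_quality(model, feature):
    """Apply the fitted line to a new clip's error feature."""
    slope, intercept = model
    return slope * feature + intercept


# Toy data, invented for illustration only: two phoneme segments per clip,
# each frame a 2-D trajectory point.
ref_segments = [[[0.0, 0.0]] * 4, [[0.0, 0.0]] * 3]
synth_segments = [[[0.5, 0.5]] * 4, [[1.5, 1.5]] * 3]
feature = clip_error_feature(synth_segments, ref_segments)

# Hypothetical training pairs: (error feature, quality score).
model = fit_quality_model([0.0, 1.0, 2.0], [5.0, 3.0, 1.0])
predicted = predict_quality(model, feature)
```

In this toy setup a larger trajectory error maps to a lower predicted score. A real system would need measured reference trajectories, quality scores collected from a study, and the metric and regression actually defined in the paper.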
Index Terms:
speech synthesis, computer animation, regression analysis, speech processing, interactive talking avatar applications, statistical quality model, data-driven speech animation approach, animation quality, SAQP, novel statistical model, on-the-fly synthesized speech animations, data-driven techniques, speech animation trajectory fitting metric, SATF, statistical regression model, Animation, Speech, Trajectory, Measurement, Principal component analysis, Predictive models, Face, statistical models, Facial animation, data-driven, visual speech animation, lip-sync, quality prediction
Citation:
Xiaohan Ma, Zhigang Deng, "A Statistical Quality Model for Data-Driven Speech Animation," IEEE Transactions on Visualization and Computer Graphics, vol. 18, no. 11, pp. 1915-1927, Nov. 2012, doi:10.1109/TVCG.2012.67