Fourth IEEE International Conference on Multimodal Interfaces (ICMI'02)
Achieving Real-Time Lip Synch via SVM-Based Phoneme Classification and Lip Shape Refinement
Pittsburgh, Pennsylvania
October 14-16, 2002
ISBN: 0-7695-1834-6
In this paper, we develop a real-time lip-synch system that drives a 2-D avatar's lip motion in synch with an incoming speech utterance. To achieve real-time operation, we bound the processing time with a merge-and-split procedure that performs coarse-to-fine phoneme classification. At each stage of phoneme classification, we apply a support vector machine (SVM) to limit the computational load while attaining the desired accuracy. The coarse-to-fine classification is accomplished in two stages of feature extraction: each speech frame is first classified into one of three lip-opening classes using MFCCs as features, and is then further classified into a detailed lip shape using formant information. We implemented the system with a 2-D lip animation, which demonstrates the effectiveness of the proposed two-stage procedure for the real-time lip-synch task.
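The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class labels (`closed`, `half_open`, `open`, `wide`, `round`, and so on), the two-element feature vectors, and all weights are hypothetical values chosen for illustration, and simple hand-set linear scorers stand in for trained SVMs. Only the routing is taken from the abstract: a coarse decision on MFCC features selects which fine classifier is applied to the formant features.

```python
from typing import Callable, Dict, Sequence, Tuple

Vector = Sequence[float]
Classifier = Callable[[Vector], str]


def linear_one_vs_rest(weights: Dict[str, Vector],
                       biases: Dict[str, float]) -> Classifier:
    """Build a classifier that returns the label with the largest score w.x + b.

    This is a stand-in for a trained SVM; the weights are set by hand here.
    """
    def classify(x: Vector) -> str:
        def score(label: str) -> float:
            return sum(w * v for w, v in zip(weights[label], x)) + biases[label]
        return max(weights, key=score)
    return classify


def two_stage_classify(mfcc: Vector,
                       formants: Vector,
                       coarse: Classifier,
                       fine_by_class: Dict[str, Classifier]) -> Tuple[str, str]:
    """Stage 1 picks a lip-opening class; stage 2 refines it to a lip shape."""
    opening = coarse(mfcc)                     # coarse decision on MFCC features
    shape = fine_by_class[opening](formants)   # fine decision on formant features
    return opening, shape


# Hypothetical coarse classifier over a toy 2-D "MFCC" vector.
coarse = linear_one_vs_rest(
    weights={"closed": [-1.0, 0.0], "half_open": [0.0, 0.0], "open": [1.0, 0.0]},
    biases={"closed": 0.0, "half_open": 0.5, "open": 0.0},
)

# Hypothetical fine classifiers over toy (F1, F2) formants in kHz.
fine_by_class = {
    "closed": linear_one_vs_rest({"closed_lips": [0.0, 0.0]}, {"closed_lips": 0.0}),
    "half_open": linear_one_vs_rest(
        {"half_wide": [0.0, 1.0], "half_round": [0.0, -1.0]},
        {"half_wide": -1.5, "half_round": 1.5},
    ),
    "open": linear_one_vs_rest(
        {"wide": [0.0, 1.0], "round": [0.0, -1.0]},
        {"wide": -1.5, "round": 1.5},
    ),
}

result = two_stage_classify([2.0, 0.3], [0.7, 2.2], coarse, fine_by_class)
print(result)
```

Because each frame passes through one coarse classifier and then only one of the fine classifiers, the per-frame cost stays small, which is consistent with the abstract's stated aim of limiting the computational load.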
Citation:
Taeyoon Kim, Yongsung Kang, Hanseok Ko, "Achieving Real-Time Lip Synch via SVM-Based Phoneme Classification and Lip Shape Refinement," Fourth IEEE International Conference on Multimodal Interfaces (ICMI'02), p. 299, 2002