Abstract
Automating the task of lip synchronization has long been an interesting yet challenging problem, especially for animation that is needed "on the fly," in real time. This thesis presents a method for creating a computer program that simplifies the process of generating real-time lip sync animation while keeping the resulting animation as believable as possible. Using a single audio voice track or live microphone input, the implemented program extracts distinguishing features from the audio signal, specifically LPC cepstral coefficients, gain, and zero-crossing rate. These features are used as input to a trained three-layer feedforward back-propagation neural network that performs phonetic classification for each frame of animation. The training of the neural network is speaker-dependent: classification is accurate only for the speaker who provided the sound samples in the training set.
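Of the features named in the abstract, gain and zero-crossing rate are straightforward to compute per analysis frame. The sketch below is illustrative only and is not the thesis's implementation: it computes log-energy gain and zero-crossing rate with NumPy, using assumed frame and hop sizes (512 and 256 samples); the LPC cepstral coefficients, which require a separate LPC analysis step, are omitted.

```python
import numpy as np

def frame_features(signal, frame_len=512, hop=256):
    """Per-frame gain (log energy) and zero-crossing rate.

    Illustrative sketch only: the thesis also uses LPC cepstral
    coefficients, which are not computed here. Frame and hop sizes
    are assumed values, not taken from the thesis.
    """
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        # Gain: log of the frame's energy (small epsilon avoids log(0)).
        gain = np.log(np.sum(frame ** 2) + 1e-10)
        # Zero-crossing rate: sign changes per sample in the frame.
        zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2
        feats.append((gain, zcr))
    return np.array(feats)

# Example: a 1 kHz sine sampled at 8 kHz crosses zero twice per cycle,
# so its zero-crossing rate is about 2 * 1000 / 8000 = 0.25.
t = np.arange(8000) / 8000.0
sig = np.sin(2 * np.pi * 1000 * t)
features = frame_features(sig)
```

Each row of the resulting feature matrix (one row per animation frame) could then serve as an input vector to a classifier such as the three-layer feedforward network the abstract describes.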
Haaser, Christina Marie (2002). Automatic real-time lip synchronization using LPC analysis and neural networks. Master's thesis, Texas A&M University. Available electronically from https://hdl.handle.net/1969.1/ETD-TAMU-2002-THESIS-H18.