Show simple item record

dc.contributor.advisor: Chai, Jinxiang
dc.creator: Shi, Fuhao
dc.date.accessioned: 2017-08-21T14:31:52Z
dc.date.available: 2019-05-01T06:08:58Z
dc.date.created: 2017-05
dc.date.issued: 2017-01-05
dc.date.submitted: May 2017
dc.identifier.uri: https://hdl.handle.net/1969.1/161283
dc.description.abstract: Facial performance capture and animation is an essential component of many applications such as movies, video games, and virtual environments. Video-based facial performance capture is particularly appealing, as it offers the lowest cost and the potential to use legacy sources and uncontrolled videos. However, it is also challenging because of complex facial movements at different scales, ambiguity caused by the loss of depth information, and a lack of discernible features on most facial regions. Unknown lighting conditions and camera parameters further complicate the problem. This dissertation explores video-based 3D facial performance capture systems that use a single video camera, overcome the aforementioned challenges, and produce accurate and robust reconstruction results.

We first develop a novel automatic facial feature detection/tracking algorithm that accurately locates important facial features across the entire video sequence; these features are then used for 3D pose and facial shape reconstruction. The key idea is to combine the respective strengths of local detection, spatial priors on facial feature locations, Active Appearance Models (AAMs), and temporal coherence. The algorithm runs in real time and is robust to large pose and expression variations as well as occlusions.

We then present an automatic high-fidelity facial performance capture system that works on monocular videos. It uses the detected facial features along with multilinear facial models to reconstruct 3D head poses and large-scale facial deformation, and uses per-pixel shading cues to add fine-scale surface details such as emerging or disappearing wrinkles and folds. We iterate the reconstruction procedure over the large-scale facial geometry and the fine-scale facial details to improve the accuracy of the reconstruction.

We further improve the accuracy and efficiency of the large-scale facial performance capture by introducing a local-binary-feature-based 2D feature regression and a convolutional-neural-network-based pose and expression regression, and we complement the system with an efficient 3D eye gaze tracker to achieve real-time 3D eye gaze animation. We have tested our systems on various monocular videos, demonstrating their accuracy and robustness under a variety of uncontrolled lighting conditions and across significant shape differences among individuals.
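For context on the "multilinear facial models" named in the abstract, the sketch below gives the standard multilinear face model formulation from the facial animation literature; the notation (core tensor C_r, identity weights w_id, expression weights w_exp) is an illustrative assumption, not the dissertation's own.

% Illustrative LaTeX sketch of a standard multilinear face model
% (assumed notation; not taken from the dissertation itself).
% V     : stacked 3D vertex positions of the face mesh
% C_r   : reduced core tensor learned from a 3D face database
% w_id  : identity weights, held fixed per subject
% w_exp : expression weights, re-estimated per video frame
\[
  V \;=\; C_r \times_2 \mathbf{w}_{id}^{\top} \times_3 \mathbf{w}_{exp}^{\top}
\]

Fitting such a model to the detected 2D features then amounts to solving for the head pose and per-frame expression weights that best reproject the model's landmark vertices onto the tracked feature locations.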
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: Facial Capture
dc.subject: Realtime Animation
dc.title: Automatic 3D Facial Performance Acquisition and Animation using Monocular Videos
dc.type: Thesis
thesis.degree.department: Computer Science and Engineering
thesis.degree.discipline: Computer Science
thesis.degree.grantor: Texas A & M University
thesis.degree.name: Doctor of Philosophy
thesis.degree.level: Doctoral
dc.contributor.committeeMember: Keyser, John
dc.contributor.committeeMember: Schaefer, Scott
dc.contributor.committeeMember: Akleman, Ergun
dc.type.material: text
dc.date.updated: 2017-08-21T14:31:52Z
local.embargo.terms: 2019-05-01
local.etdauthor.orcid: 0000-0002-3460-5820

