Synthesizing Novel Views with Diffusion Models
Abstract
Diffusion models have become the state of the art for a multitude of generative tasks such as audio and image synthesis. Until recently, however, they had seen little success on the image-to-image task of novel view synthesis, where a model is given a reference frame of a scene and is asked what the scene might look like from a different viewpoint. One such recent development is the 3DiM model, in which a conditional diffusion model is augmented with cross-attention modules that share features across views in order to synthesize more consistent, higher-quality novel views. In this work, we re-implement the 3DiM model in PyTorch to gauge its performance and analyze its ability to synthesize views across varying parameters and spatial resolutions.
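For readers unfamiliar with the cross-view conditioning mechanism mentioned above, the following is a minimal, illustrative PyTorch sketch of a cross-attention block that lets target-view features attend to reference-view features. The class, argument names, and dimensions are hypothetical choices for illustration, not code from the thesis or the original 3DiM implementation.

```python
import torch
import torch.nn as nn


class CrossViewAttention(nn.Module):
    """Attend from target-view features (queries) to reference-view features (keys/values)."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, target_feats: torch.Tensor, reference_feats: torch.Tensor) -> torch.Tensor:
        # Both inputs: (batch, height*width, channels) feature maps flattened over space.
        q = self.norm(target_feats)
        kv = self.norm(reference_feats)
        attended, _ = self.attn(q, kv, kv)
        # Residual connection keeps the target-view features while injecting
        # information from the reference view.
        return target_feats + attended


if __name__ == "__main__":
    block = CrossViewAttention(channels=128)
    tgt = torch.randn(2, 64 * 64, 128)   # target-view UNet features (hypothetical shape)
    ref = torch.randn(2, 64 * 64, 128)   # reference-view UNet features
    out = block(tgt, ref)
    print(out.shape)  # torch.Size([2, 4096, 128])
```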
Citation
Nguyen, Brandon Gia Nghi (2023). Synthesizing Novel Views with Diffusion Models. Undergraduate Research Scholars Program. Available electronically from https://hdl.handle.net/1969.1/200255.