Synthesizing Novel Views with Diffusion Models
Abstract
Diffusion models have become the state of the art for a multitude of generative tasks such as audio and image synthesis. Until recently, however, they had seen little success on the image-to-image task of novel view synthesis, where a model is given a reference frame of a scene and is asked what the scene might look like from a different viewpoint. One such recent development is the 3DiM model, in which a conditional diffusion model is augmented with cross-attention modules that share features across views in order to synthesize more consistent, higher-quality novel views. In this work, we re-implement the 3DiM model in PyTorch to gauge its performance and analyze its ability to synthesize views across varying parameters and spatial resolutions.
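For readers unfamiliar with the cross-view conditioning mechanism mentioned above, the following is a minimal, illustrative PyTorch sketch of a cross-attention block that lets target-view features attend to reference-view features. The class, argument names, and dimensions are hypothetical choices for illustration, not code from the thesis or the original 3DiM implementation.

```python
import torch
import torch.nn as nn


class CrossViewAttention(nn.Module):
    """Attend from target-view features (queries) to reference-view features (keys/values)."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, target_feats: torch.Tensor, reference_feats: torch.Tensor) -> torch.Tensor:
        # Both inputs: (batch, height*width, channels) feature maps flattened over space.
        q = self.norm(target_feats)
        kv = self.norm(reference_feats)
        attended, _ = self.attn(q, kv, kv)
        # Residual connection keeps the target-view features while injecting
        # information from the reference view.
        return target_feats + attended


if __name__ == "__main__":
    block = CrossViewAttention(channels=128)
    tgt = torch.randn(2, 64 * 64, 128)   # target-view UNet features (hypothetical shape)
    ref = torch.randn(2, 64 * 64, 128)   # reference-view UNet features
    out = block(tgt, ref)
    print(out.shape)  # torch.Size([2, 4096, 128])
```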
Citation
Nguyen, Brandon Gia Nghi (2023). Synthesizing Novel Views with Diffusion Models. Undergraduate Research Scholars Program. Available electronically from https://hdl.handle.net/1969.1/200255.