Overview
DreamFusion is a research project that leverages a pretrained 2D text-to-image diffusion model (Imagen) to synthesize 3D models from text prompts, circumventing the need for large-scale labeled 3D datasets by using a loss based on probability density distillation. The system optimizes a randomly initialized Neural Radiance Field (NeRF) via gradient descent so that its 2D renderings from random viewpoints match the text prompt. This optimization is driven by Score Distillation Sampling (SDS), which uses the diffusion model's noise predictions to supply gradients for any differentiable image parameterization, including the parameters of a 3D scene. Additional regularizers and optimization strategies refine the NeRF's geometry and appearance, producing relightable 3D objects with high-fidelity normals, surface geometry, and depth. The resulting NeRF models can be exported as meshes for integration into 3D renderers or modeling software.
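The core of SDS can be sketched in a few lines. The idea: noise the current rendering with the forward diffusion process, ask the pretrained model to predict that noise, and use the (weighted) difference between predicted and injected noise as a gradient on the rendering, skipping the diffusion model's own Jacobian. The sketch below is a toy illustration only: `toy_denoiser` is a hypothetical stand-in for a real text-conditioned diffusion model (which "prefers" an all-white image here), and the gradient is applied directly to pixels rather than backpropagated through a NeRF renderer.

```python
import numpy as np

def toy_denoiser(x_t, t):
    """Hypothetical stand-in for a pretrained text-conditioned diffusion model.
    It predicts the noise consistent with the clean image being all ones,
    i.e. the 'prompt' here means 'a white image'."""
    target = np.ones_like(x_t)
    # Invert x_t = sqrt(1-t)*x0 + sqrt(t)*eps under the assumption x0 = target.
    return (x_t - np.sqrt(1.0 - t) * target) / np.sqrt(t)

def sds_gradient(rendered, t, rng):
    """One Score Distillation Sampling gradient on a rendered image.
    Returns w(t) * (eps_hat - eps); the diffusion model's Jacobian is
    omitted, as in DreamFusion. w(t) is simplified to 1 here."""
    eps = rng.standard_normal(rendered.shape)
    x_t = np.sqrt(1.0 - t) * rendered + np.sqrt(t) * eps  # forward diffusion
    eps_hat = toy_denoiser(x_t, t)
    return eps_hat - eps

# 'Optimize' a 4x4 image toward the toy prompt by gradient descent,
# standing in for optimizing NeRF parameters through a renderer.
rng = np.random.default_rng(0)
img = np.zeros((4, 4))
for _ in range(200):
    img -= 0.05 * sds_gradient(img, t=0.5, rng=rng)
# img now approaches the all-ones 'prompt-consistent' image.
```

Note that with this denoiser the injected noise cancels exactly in `eps_hat - eps`, so the update reduces to pulling the rendering toward the model's preferred image; with a real diffusion model the cancellation is only in expectation, which is why DreamFusion averages over many random timesteps and views.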
