Introducing Stable Video 3D: Quality Novel View Synthesis and 3D Generation from Single Images

Stability AI has introduced Stable Video 3D (SV3D), a new model that significantly advances the field of 3D technology by transforming single object images into novel multi-views and subsequently generating 3D meshes. Unlike its predecessors and other open-source alternatives, SV3D offers greatly improved quality and versatility in 3D generation. The model comes in two variants, SV3D_u and SV3D_p, catering to different usage scenarios. With its capabilities in novel view synthesis and multi-view consistency, SV3D aims to improve generalization, pose-controllability, and realistic 3D output from single images.

Main Points

Introduction of Stable Video 3D

Stability AI introduces Stable Video 3D, a model for quality novel view synthesis and 3D generation from single images.

Enhanced 3D Mesh Generation

SV3D uses novel multi-view generation to improve the realism of 3D meshes created from a single input image.

Improved 3D Technology and Capabilities

The model improves on earlier 3D generation models and comes in two variants, SV3D_u and SV3D_p, for different usage scenarios.

Insights

SV3D transforms a single object image into novel multi-views and then generates 3D meshes.

SV3D takes a single object image as input and outputs novel multi-views of that object. We can then use those novel views and SV3D to generate 3D meshes.

Stable Video 3D delivers greatly improved quality and multi-view capabilities compared to previous models.

This new model advances the field of 3D technology, delivering greatly improved quality and multi-view consistency compared to the previously released Stable Zero123, and it outperforms other open-source alternatives such as Zero123-XL.

Stable Video 3D offers two variants: SV3D_u and SV3D_p, each providing unique capabilities.

This release features two variants:

SV3D_u: This variant generates orbital videos based on single image inputs without camera conditioning.

SV3D_p: Extending the capability of SV3D_u, this variant accommodates both single images and orbital views, allowing for the creation of 3D video along specified camera paths.
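To make the idea of an orbital video and a specified camera path more concrete, here is a minimal sketch of how such a path could be described as a list of camera poses. Everything in it is an assumption for illustration: the azimuth/elevation parameterization, the frame counts, and the function names (`static_orbit`, `dynamic_orbit`, `camera_position`) are hypothetical and are not taken from SV3D's actual interface.

```python
import math


def orbit_azimuths(num_frames):
    """Evenly spaced azimuth angles (degrees) covering one full turn."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    step = 360.0 / num_frames
    return [i * step for i in range(num_frames)]


def static_orbit(num_frames, elevation_deg=0.0):
    """Hypothetical fixed-elevation orbit: one (azimuth, elevation) pair per frame."""
    return [(az, elevation_deg) for az in orbit_azimuths(num_frames)]


def dynamic_orbit(num_frames, base_elevation_deg=0.0, amplitude_deg=10.0):
    """Hypothetical orbit whose elevation rises and falls once per turn."""
    poses = []
    for az in orbit_azimuths(num_frames):
        elevation = base_elevation_deg + amplitude_deg * math.sin(math.radians(az))
        poses.append((az, elevation))
    return poses


def camera_position(azimuth_deg, elevation_deg, radius=1.0):
    """Convert a pose to a Cartesian camera position on a sphere around the object."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = radius * math.cos(el) * math.cos(az)
    y = radius * math.cos(el) * math.sin(az)
    z = radius * math.sin(el)
    return (x, y, z)
```

In this sketch, a fixed-elevation orbit loosely corresponds to the kind of video SV3D_u is described as producing, while an explicit list of poses stands in for the specified camera paths that SV3D_p is described as accepting.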

Stable Video 3D significantly advances novel view synthesis (NVS), providing coherent views from any angle.

Stable Video 3D introduces significant advancements in 3D generation, particularly in novel view synthesis (NVS). Unlike previous approaches that often grapple with limited perspectives and inconsistencies in outputs, Stable Video 3D is able to deliver coherent views from any given angle with proficient generalization.

Stable Video 3D improves the quality of 3D meshes through multi-view consistency and a disentangled illumination model.

Stable Video 3D leverages its multi-view consistency to optimize 3D Neural Radiance Fields (NeRF) and mesh representations to improve the quality of 3D meshes generated directly from novel views. For this, we have designed a masked score distillation sampling loss to further enhance 3D quality in regions not visible in the predicted views. Additionally, in order to reduce the issue of baked-in lighting, Stable Video 3D employs a disentangled illumination model that is jointly optimized along with 3D shape and texture.
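The masking idea can be illustrated on its own, separately from score distillation. The sketch below only shows how a per-pixel loss might be restricted to regions that the predicted views do not cover. It is not the authors' implementation: the source does not give the form of the loss, so `per_pixel_loss` is a stand-in for whatever term the optimization computes, and `masked_mean_loss` and `unseen_mask` are hypothetical names.

```python
def masked_mean_loss(per_pixel_loss, unseen_mask):
    """Average a per-pixel loss over pixels flagged as unseen.

    per_pixel_loss: 2D list of floats (placeholder for the real loss term).
    unseen_mask:    2D list of 0/1 flags, 1 where no predicted view covers the pixel.
    Returns 0.0 when no pixel is flagged, so fully visible regions add nothing.
    """
    total = 0.0
    count = 0
    for loss_row, mask_row in zip(per_pixel_loss, unseen_mask):
        for loss, flag in zip(loss_row, mask_row):
            if flag:
                total += loss
                count += 1
    return total / count if count else 0.0
```

With a mask like this, pixels already constrained by the generated views are left to the ordinary reconstruction objective, and the extra term acts only where those views provide no supervision.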

URL

https://stability.ai/news/introducing-stable-video-3d