3D Scene Generation
Free AI tools for generating 3D scenes: immersive environments for games, film, and virtual experiences.
SceneFactor generates 3D scenes from text using an intermediate 3D semantic map. This map can be edited to add, remove, resize, and replace objects, allowing for easy regeneration of the final 3D scene.
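The useful property here is that the intermediate representation is just a coarse grid of semantic labels, so edits are simple array writes. A minimal sketch of that idea, assuming a voxel grid, a toy label set, and a hypothetical regeneration call (SceneFactor's real maps come from its text-conditioned diffusion stage):

```python
# Editing a coarse semantic voxel map, then regenerating geometry from it.
# Grid size, labels, and the regeneration call are illustrative assumptions.
import numpy as np

EMPTY, CHAIR, TABLE = 0, 1, 2
semantic_map = np.zeros((64, 64, 16), dtype=np.uint8)  # x, y, z label grid

semantic_map[10:14, 20:24, 0:4] = CHAIR   # add a chair-sized block
semantic_map[30:42, 30:42, 0:5] = TABLE   # add a table
semantic_map[10:14, 20:24, 0:4] = EMPTY   # remove the chair again
semantic_map[28:44, 28:44, 0:5] = TABLE   # "resize" by rewriting a larger extent

# A second (geometry) diffusion stage would then regenerate the edited
# regions into a detailed 3D scene -- hypothetical call:
# scene = geometry_diffusion.regenerate(semantic_map)
```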
LT3SD can generate large-scale 3D scenes using a coarse-to-fine latent tree representation that captures both low-frequency geometry and high-frequency detail. It supports flexible output sizes and produces high-quality scenes, and can even complete missing parts of a scene.
ReStyle3D can transfer the look of a style image onto real-world scenes across multiple viewpoints. It keeps structure and details intact, making it great for interior design and virtual staging.
GPS-Gaussian+ can render high-resolution 3D scenes from two or more input images in real time.
PhysFlow can simulate dynamic interactions in complex scenes. It identifies material types by querying images and combines video diffusion with a Material Point Method (MPM) simulation to produce detailed 4D representations.
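The interesting step is going from "what does this object look like?" to concrete simulation parameters. A minimal sketch of that mapping, where the VLM call, label set, and parameter values are all illustrative assumptions rather than PhysFlow's actual interface:

```python
# label -> (Young's modulus E in Pa, Poisson's ratio nu); rough placeholders
MATERIAL_PARAMS = {
    "rubber": (1e6, 0.49),
    "metal": (2e11, 0.30),
    "jelly": (1e4, 0.45),
    "sand": (3.5e7, 0.30),
}

def query_vlm(image) -> str:
    """Hypothetical VLM query: 'What material is this object made of?'"""
    return "jelly"  # stubbed answer

def material_for(image):
    label = query_vlm(image)
    E, nu = MATERIAL_PARAMS.get(label, MATERIAL_PARAMS["rubber"])
    # Lame parameters consumed by an MPM constitutive model
    mu = E / (2 * (1 + nu))
    lam = E * nu / ((1 + nu) * (1 - 2 * nu))
    return label, mu, lam
```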
LVSM can generate high-quality 3D views of objects and scenes from a few input images.
VideoScene can generate 3D scenes from sparse video views in one step.
GeometryCrafter can recover detailed 3D point maps from open-world videos.
MVGenMaster can generate up to 100 new views from a single image using a multi-view diffusion model.
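In practice, "generate N new views" means supplying N target camera poses to condition the diffusion model. A minimal sketch of building an orbit of poses, assuming a hypothetical `model.generate` entry point:

```python
# Build camera-to-world matrices on a circular orbit, all looking at the
# origin. The trajectory and the generate() call are assumptions.
import numpy as np

def orbit_poses(n_views=100, radius=2.0, height=0.5):
    poses = []
    for theta in np.linspace(0, 2 * np.pi, n_views, endpoint=False):
        eye = np.array([radius * np.cos(theta), radius * np.sin(theta), height])
        forward = -eye / np.linalg.norm(eye)          # look at the origin
        right = np.cross(forward, np.array([0.0, 0.0, 1.0]))
        right /= np.linalg.norm(right)
        up = np.cross(right, forward)
        c2w = np.eye(4)
        # columns: right, up, -forward (camera looks down -Z), position
        c2w[:3, 0], c2w[:3, 1], c2w[:3, 2], c2w[:3, 3] = right, up, -forward, eye
        poses.append(c2w)
    return poses

# views = model.generate(image, orbit_poses(100))  # hypothetical call
```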
So far it has been tough to imagine the benefits of AI agents. Most of what we’ve seen from that domain has focused on NPC simulations or solving text-based goals. 3D-GPT is a new framework that uses LLMs for instruction-driven 3D modeling, breaking 3D modeling tasks into manageable segments to procedurally generate 3D scenes. I recently started digging into Blender, and I pray this gets open-sourced at some point.
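Since 3D-GPT isn't open source, here is only a rough sketch of the instruction-driven idea: an LLM decomposes a prompt into a plan of procedural steps, and each step dispatches to a scene-building function. The plan is mocked and every function name is hypothetical; the Blender `bpy` calls are real and run inside Blender's Python environment:

```python
import bpy

def add_ground(size=20):
    bpy.ops.mesh.primitive_plane_add(size=size, location=(0, 0, 0))

def add_tree(location):
    # Stand-in for a real procedural tree generator: trunk plus canopy.
    x, y = location
    bpy.ops.mesh.primitive_cylinder_add(radius=0.2, depth=2, location=(x, y, 1))
    bpy.ops.mesh.primitive_ico_sphere_add(radius=1.0, location=(x, y, 2.5))

# Imagine an LLM turned "a small clearing with three trees" into this plan:
plan = [
    ("add_ground", {"size": 20}),
    ("add_tree", {"location": (3, 0)}),
    ("add_tree", {"location": (-2, 4)}),
    ("add_tree", {"location": (0, -3)}),
]

dispatch = {"add_ground": add_ground, "add_tree": add_tree}
for name, kwargs in plan:
    dispatch[name](**kwargs)
```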
Google DeepMind has been researching 4DiM, a cascaded diffusion model for 4D novel view synthesis. It can generate 3D scenes with temporal dynamics from a single image and a set of camera poses and timestamps.
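The "4D" part is simply that each requested view carries a camera pose and a timestamp. A minimal sketch of that conditioning interface, with all names hypothetical since 4DiM has no public code:

```python
# One source image plus per-target (pose, timestamp) pairs.
from dataclasses import dataclass
import numpy as np

@dataclass
class Target:
    pose: np.ndarray  # 4x4 camera-to-world matrix
    time: float       # timestamp in seconds

targets = [Target(pose=np.eye(4), time=t / 10) for t in range(8)]
# frames = model.sample(source_image, targets)  # hypothetical call
```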
LayerPano3D can generate immersive 3D scenes from a single text prompt by breaking a 2D panorama into depth layers.
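The core trick, splitting a panorama into depth layers, can be sketched with a depth map and a few bins; LayerPano3D's real pipeline additionally inpaints occluded regions and lifts the layers into 3D. The inputs and bin edges below are placeholders:

```python
# Slice an equirectangular panorama into near/mid/far RGBA layers by depth.
import numpy as np

panorama = np.random.rand(512, 1024, 3)  # placeholder equirectangular RGB
depth = np.random.rand(512, 1024)        # placeholder per-pixel depth in [0, 1]

bin_edges = [0.0, 0.33, 0.66, 1.01]      # near / mid / far (arbitrary bins)
layers = []
for near, far in zip(bin_edges[:-1], bin_edges[1:]):
    mask = (depth >= near) & (depth < far)
    rgba = np.dstack([panorama, mask.astype(np.float32)])  # alpha from mask
    layers.append(rgba)
```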
OmniPhysGS can generate realistic 3D dynamic scenes by modeling objects with Constitutive 3D Gaussians.
Wonderland can generate high-quality 3D scenes from a single image using a camera-guided video diffusion model. It allows for easy navigation and exploration of 3D spaces, performing better than other methods, especially on images it hasn’t seen before.
DAS3R can decompose scenes and rebuild static backgrounds from videos.
L4GM is a 4D Large Reconstruction Model that can turn a single-view video into an animated 3D object.
SelfSplat can create 3D models from multiple images without known camera poses. It uses self-supervised depth and pose estimation, producing high-quality appearance and geometry from real-world data.
Long-LRM can reconstruct large 3D scenes from up to 32 input images at 960x540 resolution in just 1.3 seconds on a single A100 80G GPU.
CityGaussianV2 can reconstruct large-scale scenes from multi-view RGB images with high accuracy.
PF3plat can synthesize photorealistic novel views and estimate accurate camera poses from uncalibrated image collections.