AI Toolbox | AI Art Weekly

MARBLE

MARBLE can blend and change the material properties of objects in images using material embeddings in CLIP-space. It allows control over attributes like roughness, metallic, transparency, and glow, enabling multiple edits at once and supporting various artistic styles.

04.06.25 · Project Page · Code · Image Editing
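
To make the idea of blending materials in an embedding space more concrete, here is a minimal sketch using toy vectors. It is not MARBLE's code: the 4-dimensional vectors, the `roughness_direction` offset, and the helper names `lerp`, `shift`, and `normalize` are all assumptions made for illustration. Real CLIP embeddings have hundreds of dimensions and come from an image encoder.

```python
from math import sqrt


def lerp(a, b, t):
    """Linearly interpolate between two equal-length vectors (t=0 gives a, t=1 gives b)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def shift(v, direction, strength):
    """Move a vector along an attribute direction by the given strength."""
    return [x + strength * d for x, d in zip(v, direction)]


def normalize(v):
    """Scale a vector to unit length, as is common for CLIP-style embeddings."""
    norm = sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


# Hypothetical stand-ins for two material embeddings and one attribute direction.
wood = normalize([1.0, 0.0, 0.0, 0.0])
metal = normalize([0.0, 1.0, 0.0, 0.0])
roughness_direction = [0.0, 0.0, 1.0, 0.0]

# Blend the two materials half and half, then nudge the result toward "rougher".
blended = normalize(lerp(wood, metal, 0.5))
edited = normalize(shift(blended, roughness_direction, 0.3))
```

In this toy setup, several attribute edits could be applied in one pass by calling `shift` once per attribute direction before the final `normalize`.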

Synergizing Motion and Appearance

Synergizing Motion and Appearance can generate high-quality talking head videos by combining facial identity from a source image with motion from a driving video.

03.06.25 · Project Page · Code · Video-to-Video · Controllable Video Generation · Image-to-Video · Talking Head Generation

Triangle Splatting for Real-Time Radiance Field Rendering

After NeRFs and Gaussian Splatting comes Triangle Splatting, a new method that can render real-time radiance fields at over 2,400 FPS at 1280x720 resolution. It combines triangle representations with differentiable rendering for better visual quality and faster results than Gaussian splatting methods.

03.06.25 · Project Page · Code · 3D Scene Generation

UniTEX

UniTEX can generate high-quality textures for 3D assets without using UV mapping. It maps 3D points to texture values based on surface proximity and uses a transformer-based model for better texture quality.

03.06.25 · Code · Image-to-Texture · 3D Texture Generation · Image-to-3D
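
As a rough illustration of looking up texture values by surface proximity instead of UV coordinates, the sketch below colors a 3D query point from its nearest surface samples using inverse-distance weighting. This is a simple stand-in, not UniTEX's method, which uses a learned transformer-based model. The function name `texture_at`, the sample layout, and the parameters `k` and `eps` are assumptions.

```python
from math import dist


def texture_at(point, surface_samples, k=2, eps=1e-9):
    """Return an RGB value for a 3D point from its k nearest surface samples.

    surface_samples is a list of (position, color) pairs, where position is an
    (x, y, z) tuple and color is an (r, g, b) tuple. Closer samples get more
    weight (inverse-distance weighting); eps avoids division by zero.
    """
    if not surface_samples:
        raise ValueError("need at least one surface sample")
    nearest = sorted(surface_samples, key=lambda s: dist(point, s[0]))[:k]
    weights = [1.0 / (dist(point, pos) + eps) for pos, _ in nearest]
    total = sum(weights)
    return tuple(
        sum(w * color[c] for w, (_, color) in zip(weights, nearest)) / total
        for c in range(3)
    )


# Hypothetical surface samples: a red, a blue, and a distant green point.
samples = [
    ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
    ((2.0, 0.0, 0.0), (0.0, 0.0, 1.0)),
    ((0.0, 5.0, 0.0), (0.0, 1.0, 0.0)),
]

# A query point halfway between the red and blue samples gets an even mix.
midpoint_color = texture_at((1.0, 0.0, 0.0), samples)
```

Because the lookup depends only on 3D positions, no UV unwrapping of the mesh is needed in this toy version.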

Generative Omnimatte

Generative Omnimatte can break down videos into meaningful layers, isolating objects, shadows, and reflections without needing static backgrounds. It uses a video diffusion model for high-quality results and can fill in hidden areas, enhancing video editing options.

03.06.25 · Project Page · Code · Video Inpainting · Video Editing

Direct3D-S2

Direct3D-S2 can generate high-resolution 3D shapes.

30.05.25 · Project Page · Code · Image-to-3D

MiniMax-Remover

MiniMax-Remover can remove objects from videos efficiently with just 6 sampling steps.

30.05.25 · Project Page · Code · Video Object Detection · Video Editing

EPiC

EPiC can control camera movement in image-to-video and video-to-video tasks without requiring detailed camera paths.

29.05.25 · Project Page · Code · Video-to-Video · Controllable Video Generation

SceneFactor

SceneFactor generates 3D scenes from text using an intermediate 3D semantic map. This map can be edited to add, remove, resize, and replace objects, allowing for easy regeneration of the final 3D scene.

28.05.25 · Project Page · Code · Text-to-3D · 3D Editing · 3D Scene Generation
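
The editing operations on the semantic map (add, remove, resize, replace) can be pictured with a toy voxel grid of labels. The sketch below is an assumption-laden illustration, not SceneFactor's data structure: the dictionary-based grid, the box-shaped objects, and all function names are hypothetical.

```python
from itertools import product

EMPTY = "empty"


def make_map(size):
    """Create a size x size x size semantic map with every voxel labeled EMPTY."""
    return {voxel: EMPTY for voxel in product(range(size), repeat=3)}


def box(lo, hi):
    """Yield every voxel in the axis-aligned box from lo (inclusive) to hi (exclusive)."""
    return product(*(range(a, b) for a, b in zip(lo, hi)))


def add_object(smap, label, lo, hi):
    """Label a box of voxels; voxels outside the map are ignored."""
    for voxel in box(lo, hi):
        if voxel in smap:
            smap[voxel] = label


def remove_object(smap, label):
    """Reset every voxel carrying this label to EMPTY."""
    for voxel, current in smap.items():
        if current == label:
            smap[voxel] = EMPTY


def replace_object(smap, old_label, new_label):
    """Swap one semantic label for another, keeping the occupied voxels."""
    for voxel, current in smap.items():
        if current == old_label:
            smap[voxel] = new_label


def resize_object(smap, label, lo, hi):
    """Resize by clearing the old extent and writing the new box."""
    remove_object(smap, label)
    add_object(smap, label, lo, hi)


def count(smap, label):
    """Number of voxels carrying the given label."""
    return sum(1 for current in smap.values() if current == label)


scene = make_map(8)
add_object(scene, "table", (1, 0, 1), (4, 2, 4))    # 3 x 2 x 3 voxels
replace_object(scene, "table", "desk")              # same voxels, new label
resize_object(scene, "desk", (1, 0, 1), (5, 2, 4))  # 4 x 2 x 3 voxels
```

In a pipeline of this kind, the edited map would then be handed back to the generator to regenerate the final 3D scene; that step is outside this sketch.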

RenderFormer

RenderFormer can render images from triangle mesh representations with full global illumination effects.

28.05.25 · Project Page · Code · 3D Scene Generation

DualParal

DualParal can generate minute-long videos.

28.05.25 · Project Page · Code · Text-to-Video

OmniPainter

OmniPainter can generate high-quality images that match a prompt and a style reference image in just 4 to 6 timesteps. It uses the self-consistency property of latent consistency models to ensure the results closely align with the style of the reference image.

27.05.25 · Project Page · Code · Text-to-Image · Image Style Transfer

LT3SD

LT3SD can generate large-scale 3D scenes using a method that captures both basic shapes and fine details. It allows for flexible output sizes and produces high-quality scenes, even completing missing parts of a scene.

26.05.25 · Project Page · Code · 3D Scene Generation

ReStyle3D

ReStyle3D can transfer the look of a style image to real-world scenes from different angles. It keeps the structure and details intact, making it great for interior design and virtual staging.

26.05.25 · Project Page · Code · 3D Style Transfer · 3D Editing · 3D Scene Generation

BAGEL

BAGEL is a unified multimodal model that can understand and generate images and text, excelling in tasks like image editing and predicting future frames. In short, it is an open-source counterpart to GPT-4o.

23.05.25 · Project Page · Code · Image Editing

Uni3C

Uni3C is a video generation method that supports control over both camera movement and human motion.

21.05.25 · Project Page · Code · Controllable Video Generation

4K4DGen

4K4DGen can turn a single panorama image into an immersive 4D environment with 360-degree views at 4K resolution. The method is able to animate the scene and optimize a set of 4D Gaussians using efficient splatting techniques for real-time exploration.

20.05.25 · Project Page · Code · Image-to-3D · Video-to-4D

PixelHacker

PixelHacker can perform image inpainting with strong consistency in structure and meaning. It uses a diffusion-based model and a dataset of 14 million image-mask pairs, achieving better results than other methods in texture, shape, and color consistency.

20.05.25 · Project Page · Code · Image Inpainting

MVPainter

MVPainter can generate high-quality 3D textures by aligning reference textures with geometry.

20.05.25 · Project Page · Code · Image-to-Texture · 3D Texture Generation

MoCha

MoCha can generate talking character animations from speech and text, allowing for multi-character conversations with turn-based dialogue.

19.05.25 · Project Page · Code · Text-to-Video