AI Toolbox
A curated collection of 915 free, cutting-edge AI papers with code and tools for text, image, video, 3D, and audio generation and manipulation.

PicoAudio is a temporally controlled audio generation framework. The model can generate audio with precise control over event timestamps and occurrence frequency.
PartGLEE can locate and identify objects and their parts in images. The method uses a unified framework that enables detection, segmentation, and grounding at any granularity.
MIGC++ is a plug-and-play controller that enables Stable Diffusion with precise position control while ensuring the correctness of various attributes like color, shape, material, texture, and style. It can also control the number of instances and improve interaction between instances.
AniPortrait can generate high-quality portrait animations driven by audio and a reference portrait image. It also supports face reenactment from a reference video.
DiffIR2VR-Zero is a zero-shot video restoration method that can be used with any 2D image restoration diffusion model. The method can perform 8× super-resolution and high-standard-deviation video denoising.
DIRECTOR can generate complex camera trajectories from text that describes the relationship and synchronization between the camera and characters.
FoleyCrafter can generate high-quality sound effects for videos. Results aim to be semantically relevant and temporally synchronized with the video. It also supports text prompts for finer control over the video-to-audio generation.
Motion Prompting can control video generation using motion paths. It allows for camera control, motion transfer, and drag-based image editing, producing realistic movements and physics.
StyleShot can mimic and transfer various styles from an image, such as 3D, flat, abstract, or even fine-grained styles, without tuning.
MimicMotion can generate high-quality videos of arbitrary length that mimic specific motion guidance. The method can produce videos of up to 10,000 frames with acceptable resource consumption.
AnyControl is a new text-to-image guidance method that can generate images from diverse control signals, such as color, shape, texture, and layout.
Text-Animator can depict the structures of visual text in generated videos. It supports camera control and text refinement to improve the stability of the generated visual text.
BRDF-Uncertainty can estimate the properties of the materials on an object’s surface in seconds given its geometry and a lighting environment.
MotionBooth can generate videos of customized subjects from a few images and a text prompt with precise control over both object and camera movements.
Director3D can generate real-world 3D scenes and adaptive camera trajectories from text prompts. The method is able to generate pixel-aligned 3D Gaussians as an immediate 3D scene representation for consistent denoising.
MoMo is a new video frame interpolation method that can generate intermediate frames with high visual quality and reduced computational demands.
FreeTraj is a tuning-free approach that enables trajectory control in video diffusion models by modifying noise sampling and attention mechanisms.
Portrait3D can generate high-quality 3D heads with accurate geometry and texture from a single in-the-wild portrait image.
MIRReS can reconstruct and optimize the explicit geometry, material, and lighting of objects from multi-view images. The resulting 3D models can be edited and relit in modern graphics engines or CAD software.
LiveScene can identify and control multiple objects in complex scenes. It can locate individual objects in different states and control them using natural language.