AI Toolbox
A curated collection of 965 free cutting-edge AI papers with code and tools for text, image, video, 3D, and audio generation and manipulation.
MotionCtrl is a flexible motion controller that can manage both camera and object motion in generated videos and works with VideoCrafter1, AnimateDiff, and Stable Video Diffusion.
DPM-Solver can generate high-quality samples from diffusion probabilistic models in just 10 to 20 function evaluations. It is 4 to 16 times faster than previous methods and works with both discrete-time and continuous-time models without extra training.
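The speedup comes from treating the diffusion ODE as semi-linear and integrating its linear part exactly, in the style of an exponential integrator. The sketch below is a toy 1D analogue I chose for illustration (the ODE `dx/dt = -x + cos(t)` is a stand-in, not the actual diffusion ODE or DPM-Solver code): with very few, very large steps, the exponential-Euler update stays accurate while plain Euler blows up.

```python
import math

def exact(t, x0=0.0):
    # closed-form solution of dx/dt = -x + cos(t), x(0) = x0
    return (x0 - 0.5) * math.exp(-t) + 0.5 * (math.cos(t) + math.sin(t))

def euler(x0, t_end, n):
    # plain explicit Euler: approximates the whole right-hand side
    h = t_end / n
    x, t = x0, 0.0
    for _ in range(n):
        x = x + h * (-x + math.cos(t))
        t += h
    return x

def exp_euler(x0, t_end, n):
    # exponential Euler: the linear drift e^{-h} is integrated exactly,
    # only the non-linear term cos(t) is approximated per step
    h = t_end / n
    x, t = x0, 0.0
    for _ in range(n):
        x = math.exp(-h) * x + (1.0 - math.exp(-h)) * math.cos(t)
        t += h
    return x

t_end, n = 25.0, 10  # only 10 huge steps (h = 2.5)
ref = exact(t_end)
print("euler error:    ", abs(euler(0.0, t_end, n) - ref))
print("exp-euler error:", abs(exp_euler(0.0, t_end, n) - ref))
```

With step size 2.5, plain Euler's amplification factor `|1 - h| = 1.5` makes it diverge, while the exponential update remains stable; that same "solve the linear part exactly" idea is what lets DPM-Solver take so few steps.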
AmbiGen can generate ambigrams by optimizing letter shapes so a word reads clearly from two viewing orientations. It improves word accuracy by over 11.6% and reduces edit distance by 41.9% on the 500 most common English words.
Readout Guidance can control text-to-image diffusion models using lightweight networks called readout heads. It enables pose, depth, and edge-guided generation with fewer parameters and training samples, allowing for easier manipulation and consistent identity generation.
X-Adapter can enable pretrained plugins like ControlNet and LoRA from Stable Diffusion 1.5 to work with the SDXL model without retraining. It adds trainable mapping layers for feature remapping and uses a null-text training strategy to improve compatibility and functionality.
Custom Diffusion can quickly fine-tune text-to-image diffusion models to generate new variations from just a few examples in about 6 minutes on 2 A100 GPUs. It allows for the combination of multiple concepts and requires only 75MB of storage for each additional model, which can be compressed to 5-15MB.
DiffusionMat is a novel image matting framework that employs a diffusion model for the transition from coarse to refined alpha mattes. The key innovation of the framework is a correction module that adjusts the output at each denoising step, ensuring that the final result is consistent with the input image’s structures.
Given one or more style references, StyleCrafter can generate images and videos based on these referenced styles.
4D-fy can generate high-quality 4D scenes from text prompts. It combines the strengths of text-to-image and text-to-video models to create dynamic scenes with great visual quality and realistic motion.
Material Palette can extract a palette of PBR materials (albedo, normals, and roughness) from a single real-world image. Looks very useful for creating new materials for 3D scenes or even for generating textures for 2D art.
Diffusion Motion Transfer is able to translate videos with a text prompt while maintaining the input video’s motion and scene layout.
Sketch Video Synthesis can turn videos into SVG sketches using frame-wise Bézier curves. It allows for impressive visual effects like resizing, color filling, and adding doodles to original images while maintaining a smooth flow between frames.
LucidDreamer can generate navigable 3D Gaussian Splat scenes from a single text prompt or a single image. Text prompts can also be chained for more control over the output. Can’t wait until they can also be animated.
LiveSketch can automatically add motion to a single-subject sketch from a text prompt describing the desired motion. The outputs are short SVG animations that can be easily edited.
PhysGaussian is a simulation-rendering pipeline that can simulate the physics of 3D Gaussian Splats while simultaneously rendering photorealistic results. The method supports flexible dynamics, a diverse range of materials, and collisions.
Concept Sliders is a method that allows for fine-grained control over textual and visual attributes in Stable Diffusion XL. By using simple text descriptions or a small set of paired images, artists can train concept sliders to represent the direction of desired attributes. At generation time, these sliders can be used to control the strength of the concept in the image, enabling nuanced tweaking.
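At generation time a trained slider is essentially a learned low-rank weight offset blended into the model with a user-chosen strength. A minimal sketch of that mechanic in plain Python — the matrices and the `apply_slider` helper are hypothetical illustrations, not the paper's API:

```python
def matmul(A, B):
    """Naive matrix multiply for small demo matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def apply_slider(W, down, up, strength):
    """Return W + strength * (up @ down): a low-rank 'concept direction'
    added to a weight matrix, scaled by the slider strength."""
    delta = matmul(up, down)
    return [[w + strength * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# hypothetical 2x2 weight matrix and a rank-1 concept direction
W = [[1.0, 0.0], [0.0, 1.0]]
down = [[1.0, 0.0]]   # 1x2 projection
up = [[0.0], [1.0]]   # 2x1 expansion
print(apply_slider(W, down, up, 0.0))  # strength 0 leaves weights unchanged
print(apply_slider(W, down, up, 2.0))  # larger strength = stronger concept
```

Because the offset is low-rank and applied linearly, sliders are cheap to store and can be scaled continuously (or combined) at inference time without retraining the base model.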
LucidDreamer is a text-to-3D generation framework that is able to generate 3D models with high-quality textures and shapes. Higher quality means longer inference. This one takes 35 minutes on an A100 GPU.
It’s been a while since I last doomed the TikTok dancers. MagicDance is gonna doom them some more. This model can combine human motion with reference images to precisely generate appearance-consistent videos. While the results still contain visible artifacts and jittering, give it a few months and I’m sure we can’t tell the difference no more.
The Chosen One can generate consistent characters in text-to-image diffusion models using just a text prompt. It improves character identity and prompt alignment, making it useful for story visualization, game development, and advertising.
3D Paintbrush can automatically add textures to specific areas on 3D models using text descriptions. It produces detailed localization and texture maps, enhancing the quality of graphics in various projects.