Image AI Tools
Free image AI tools for generating and editing visuals and creating 3D assets for games, films, and more, helping you optimize your creative projects.
Visual Style Prompting can generate images in the style of a reference image. Compared to other methods like IP-Adapter and LoRAs, Visual Style Prompting is better at retaining the style of the reference image while avoiding style leakage from text prompts.
Continuous 3D Words is a control method that can modify attributes in images with a slider-based approach. This allows for finer control over attributes such as illumination, non-rigid shape changes (like wings), and camera orientation.
SEELE can reposition objects within an image. It does so by removing the object, inpainting the occluded portions, and harmonizing the appearance of the repositioned object with its surroundings.
StableIdentity is a method that can generate diverse customized images in various contexts from a single input image. The cool thing about this method is that it can combine the learned identity with ControlNet and even inject it into video (ModelScope) and 3D (LucidDreamer) generation.
pix2gestalt is able to estimate the shape and appearance of whole objects that are only partially visible behind occlusions.
Depth Anything is a new monocular depth estimation method. The model is trained on 1.5M labeled images and 62M+ unlabeled images, which results in impressive generalization ability.
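If you want to try it locally, the released checkpoints can reportedly be loaded through the Hugging Face Transformers depth-estimation pipeline. Here is a minimal sketch; the model id and input URL are assumptions:

```python
# Minimal sketch: monocular depth estimation with a Depth Anything checkpoint
# via the Hugging Face "depth-estimation" pipeline. Model id is an assumption.
import requests
from PIL import Image
from transformers import pipeline

depth_estimator = pipeline(
    task="depth-estimation",
    model="LiheYoung/depth-anything-small-hf",  # assumed checkpoint id
)

url = "https://example.com/room.jpg"  # placeholder input image
image = Image.open(requests.get(url, stream=True).raw)

result = depth_estimator(image)
result["depth"].save("depth_map.png")  # PIL image with per-pixel relative depth
```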
InstantID is an ID embedding-based method that can be used to personalize images in various styles using just a single facial image, while ensuring high fidelity.
FlexGen can generate high-quality, multi-view images from a single-view image or text prompt. It lets users change unseen areas and adjust material properties like metallic and roughness, improving control over the final image.
PIA is a method that can animate images generated by custom Stable Diffusion checkpoints with realistic motions based on a text prompt.
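A minimal sketch of how this could look with the PIA integration in diffusers; the adapter and base-model checkpoint ids below are assumptions:

```python
# Minimal sketch (assumed checkpoint ids): animating a still image with PIA,
# using a community Stable Diffusion checkpoint as the base model.
import torch
from diffusers import EulerDiscreteScheduler, MotionAdapter, PIAPipeline
from diffusers.utils import export_to_gif, load_image

adapter = MotionAdapter.from_pretrained("openmmlab/PIA-condition-adapter")  # assumed id
pipe = PIAPipeline.from_pretrained(
    "SG161222/Realistic_Vision_V6.0_B1_noVAE",  # assumed custom SD 1.5 checkpoint
    motion_adapter=adapter,
    torch_dtype=torch.float16,
).to("cuda")
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)

image = load_image("https://example.com/bear.jpg")  # placeholder still image
output = pipe(image=image, prompt="a bear dancing in the snow", num_inference_steps=25)
export_to_gif(output.frames[0], "pia_animation.gif")
```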
Intrinsic Image Diffusion can generate detailed albedo, roughness, and metallic maps from a single indoor scene image.
DiffusionLight can estimate the lighting in a single input image and convert it into an HDR environment map. The technique is able to generate multiple chrome balls with varying exposures for HDR merging and can be used to seamlessly insert 3D objects into an existing photograph. Pretty cool.
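The HDR-merging step can be illustrated with plain OpenCV. This is a generic sketch of combining differently exposed chrome-ball renders into a Radiance .hdr environment probe, not the DiffusionLight code itself; filenames and exposure values are placeholders:

```python
# Generic sketch: merge chrome-ball crops rendered at several exposures
# into a single HDR map with OpenCV's Debevec pipeline.
import cv2
import numpy as np

# Assumed inputs: LDR chrome-ball images generated at different simulated exposures.
paths = ["ball_ev0.png", "ball_ev-2.png", "ball_ev-4.png"]        # placeholder filenames
exposure_times = np.array([1.0, 0.25, 0.0625], dtype=np.float32)  # relative exposures

images = [cv2.imread(p) for p in paths]

# Recover the camera response curve, then merge into a linear HDR radiance map.
calibrate = cv2.createCalibrateDebevec()
response = calibrate.process(images, exposure_times)
merge = cv2.createMergeDebevec()
hdr = merge.process(images, exposure_times, response)

cv2.imwrite("chrome_ball.hdr", hdr)  # Radiance .hdr file usable as an environment probe
```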
ControlNet-XS can control text-to-image diffusion models like Stable Diffusion and Stable Diffusion-XL with only 1% of the parameters of the base model. It is about twice as fast as ControlNet and produces higher quality images with better control.
LayerPeeler can remove hidden layers from images and create vector graphics with clear paths and organized layers.
PhotoMaker can generate realistic human photos from input images and text prompts. It can change attributes of people, like changing hair colour and adding glasses, turn people from artworks like Van Gogh’s self-portrait into realistic photos, or mix identities of multiple people.
DPM-Solver can generate high-quality samples from diffusion probabilistic models in just 10 to 20 function evaluations. It is 4 to 16 times faster than previous methods and works with both discrete-time and continuous-time models without extra training.
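DPM-Solver ships with diffusers as DPMSolverMultistepScheduler, so speeding up a Stable Diffusion pipeline is mostly a scheduler swap. A minimal sketch, with the model id as an assumption:

```python
# Minimal sketch: sampling Stable Diffusion in ~20 steps by swapping in DPM-Solver
# (DPMSolverMultistepScheduler) instead of the default scheduler. Model id assumed.
import torch
from diffusers import DPMSolverMultistepScheduler, StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

# 20 function evaluations instead of the usual 50 denoising steps.
image = pipe("a watercolor fox in a forest", num_inference_steps=20).images[0]
image.save("fox.png")
```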
AmbiGen can generate ambigrams by optimizing letter shapes for clear reading from two angles. It improves word accuracy by over 11.6% and reduces edit distance by 41.9% on the 500 most common English words.
Readout Guidance can control text-to-image diffusion models using lightweight networks called readout heads. It enables pose, depth, and edge-guided generation with fewer parameters and training samples, allowing for easier manipulation and consistent identity generation.
X-Adapter can enable pretrained plugins like ControlNet and LoRA from Stable Diffusion 1.5 to work with the SDXL model without retraining. It adds trainable mapping layers for feature remapping and uses a null-text training strategy to improve compatibility and functionality.
Custom Diffusion can quickly fine-tune text-to-image diffusion models to generate new variations from just a few examples in about 6 minutes on 2 A100 GPUs. It allows for the combination of multiple concepts and requires only 75MB of storage for each additional model, which can be compressed to 5-15MB.
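diffusers has a Custom Diffusion integration; the sketch below assumes you have already trained a concept with its training script and saved the weights to ./custom-model, with <new1> as the placeholder modifier token:

```python
# Minimal sketch (paths and token are placeholders): loading Custom Diffusion
# weights into a Stable Diffusion pipeline for inference.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

# Load the fine-tuned cross-attention weights and the new concept token embedding.
pipe.unet.load_attn_procs("./custom-model", weight_name="pytorch_custom_diffusion_weights.bin")
pipe.load_textual_inversion("./custom-model", weight_name="<new1>.bin")

image = pipe("<new1> cat sitting on a beach", num_inference_steps=50, guidance_scale=7.5).images[0]
image.save("custom_cat.png")
```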
DiffusionMat is a novel image matting framework that employs a diffusion model for the transition from coarse to refined alpha mattes. The key innovation of the framework is a correction module that adjusts the output at each denoising step, ensuring that the final result is consistent with the input image’s structures.