Image AI Tools
Free image AI tools for generating and editing visuals and creating 3D assets for games, films, and more, to streamline your creative projects.
DWPose is a pose estimator that uses a two-stage distillation approach to improve pose estimation accuracy.
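The two-stage idea boils down to training a compact student to mimic a large teacher's keypoint heatmaps alongside the ground truth. Here is a minimal PyTorch sketch of such a heatmap distillation loss; the loss weighting and tensor shapes are illustrative assumptions, not DWPose's actual architecture or settings:

```python
import torch
import torch.nn.functional as F

def heatmap_distillation_loss(student_heatmaps: torch.Tensor,
                              teacher_heatmaps: torch.Tensor,
                              gt_heatmaps: torch.Tensor,
                              alpha: float = 0.5) -> torch.Tensor:
    """Blend ground-truth supervision with soft targets from a teacher.

    All tensors are (batch, num_keypoints, H, W). `alpha` balances the
    two terms; the value here is illustrative, not DWPose's setting.
    """
    loss_gt = F.mse_loss(student_heatmaps, gt_heatmaps)
    loss_distill = F.mse_loss(student_heatmaps, teacher_heatmaps.detach())
    return alpha * loss_gt + (1.0 - alpha) * loss_distill

# Toy usage with random tensors standing in for model outputs.
student = torch.rand(2, 17, 64, 48, requires_grad=True)
teacher = torch.rand(2, 17, 64, 48)
gt = torch.rand(2, 17, 64, 48)
loss = heatmap_distillation_loss(student, teacher, gt)
loss.backward()
```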
Interpolating between Images with Diffusion Models can generate smooth transitions between two images using latent diffusion models. It produces high-quality results across different styles and subjects, using CLIP to select the best images for interpolation.
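The key trick is to interpolate in the diffusion model's latent space rather than in pixel space, typically with spherical interpolation (slerp) so intermediate latents stay on a plausible manifold. A minimal sketch of slerp over latents; the latent shape is illustrative, and the CLIP-based ranking of decoded candidates described above is applied afterwards:

```python
import torch

def slerp(z0: torch.Tensor, z1: torch.Tensor, t: float) -> torch.Tensor:
    """Spherical interpolation between two latent tensors."""
    z0_flat, z1_flat = z0.flatten(), z1.flatten()
    omega = torch.acos(torch.clamp(
        torch.dot(z0_flat / z0_flat.norm(), z1_flat / z1_flat.norm()),
        -1.0, 1.0))
    so = torch.sin(omega)
    if so.abs() < 1e-6:  # nearly parallel latents: fall back to lerp
        return (1.0 - t) * z0 + t * z1
    return (torch.sin((1.0 - t) * omega) / so) * z0 + \
           (torch.sin(t * omega) / so) * z1

# A trajectory of latents between two encoded images (shape is a stand-in
# for a Stable Diffusion 512x512 latent).
z_a, z_b = torch.randn(1, 4, 64, 64), torch.randn(1, 4, 64, 64)
frames = [slerp(z_a, z_b, t) for t in torch.linspace(0, 1, 9)]
```

Each latent is then decoded to an image, and CLIP similarity is used to keep the best candidates, per the method's selection step.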
FABRIC can condition diffusion models on feedback images to improve image quality. This method allows users to personalize content through multiple feedback rounds without needing training.
AnimateDiff is a new framework that brings video generation to the Stable Diffusion pipeline. This means you can generate videos with any existing Stable Diffusion model without having to fine-tune or train anything. Pretty amazing. @DigThatData put together a Google Colab notebook in case you want to give it a try.
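If you would rather skip Colab, the AnimateDiff integration that later landed in Hugging Face diffusers exposes the same idea: a pretrained motion adapter plugged into an ordinary Stable Diffusion checkpoint. A minimal sketch, assuming a recent diffusers release; the model IDs and prompt are illustrative:

```python
import torch
from diffusers import AnimateDiffPipeline, MotionAdapter, DDIMScheduler
from diffusers.utils import export_to_gif

# Motion adapter from the AnimateDiff authors; any SD 1.5-based
# checkpoint can serve as the base model.
adapter = MotionAdapter.from_pretrained(
    "guoyww/animatediff-motion-adapter-v1-5-2", torch_dtype=torch.float16)
pipe = AnimateDiffPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    motion_adapter=adapter, torch_dtype=torch.float16).to("cuda")
pipe.scheduler = DDIMScheduler.from_config(
    pipe.scheduler.config, beta_schedule="linear")

frames = pipe("a corgi running on the beach, best quality",
              num_frames=16, num_inference_steps=25).frames[0]
export_to_gif(frames, "corgi.gif")
```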
Text2Cinemagraph can create cinemagraphs from text descriptions, animating elements like flowing rivers and drifting clouds. It combines artistic images with realistic ones to accurately show motion, outperforming other methods in generating cinemagraphs for natural and artistic scenes.
CSD-Edit is a multi-modality editing approach that, unlike other methods, works well on images beyond the traditional 512x512 limitation and can edit 4K or large panorama images. It also offers improved temporal consistency across video frames and improved view consistency when editing or generating 3D scenes.
DreamDiffusion can generate high-quality images from brain EEG signals without needing to translate thoughts into text. It uses pre-trained text-to-image models and special techniques to handle noise and individual differences, making it a key step towards affordable thoughts-to-image technology.
DiffSketcher is a tool that can turn words into vectorized free-hand sketches. The method also lets you define the level of abstraction, allowing for more abstract or more concrete generations.
Diffusion with Forward Models is able to reconstruct 3D scenes from a single input image. Additionally, it can add small, short motions to images with people in them.
Cocktail is a pipeline for guided image generation. Compared to ControlNet, it requires only one generalized model for multiple modalities such as edge, pose, and mask guidance.
There is a new text-to-image player called RAPHAEL in town. The model aims to generate highly artistic images that accurately portray the text prompts, encompassing multiple nouns, adjectives, and verbs. This is all great, but only if someone actually releases the model for open-source consumption, as the community is craving a model that can achieve Midjourney quality.
Super-Resolution of License Plate Images Using Attention Modules and Sub-Pixel Convolution Layers can enhance low-resolution license plate images. It uses attention and transformer modules to improve details and a special loss function based on Optical Character Recognition to achieve better image quality.
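The sub-pixel convolution part is a standard upsampling trick: a convolution produces r² times the channels, which PixelShuffle then rearranges into an image r times larger in each dimension. A minimal PyTorch sketch of such an upsampling block; the channel counts and activation are illustrative, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class SubPixelUpsample(nn.Module):
    """Upscale feature maps by `scale` using sub-pixel convolution."""
    def __init__(self, channels: int, scale: int = 2):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels * scale ** 2,
                              kernel_size=3, padding=1)
        # (B, C*r^2, H, W) -> (B, C, r*H, r*W)
        self.shuffle = nn.PixelShuffle(scale)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.shuffle(self.conv(x)))

# A 32x16 low-resolution plate feature map upscaled to 64x32.
x = torch.randn(1, 64, 16, 32)
print(SubPixelUpsample(64, scale=2)(x).shape)  # torch.Size([1, 64, 32, 64])
```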
Break-A-Scene can extract multiple concepts from a single image using segmentation masks. It allows users to re-synthesize individual concepts or combinations in different contexts, enhancing scene generation with a two-phase customization process.
DragGAN can manipulate images by letting users drag points to change the pose, shape, and layout of objects. It produces realistic results even when parts of the image are hidden or deformed.
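Under the hood, DragGAN alternates two steps: motion supervision, which nudges the generator's features so the patch around each handle point moves one small step toward its target, and point tracking, which re-locates the handle in the updated features. A heavily simplified sketch of a motion-supervision loss on a generic feature map; the feature extraction, neighborhood radius, and sampling scheme are stand-ins, not DragGAN's actual implementation:

```python
import torch
import torch.nn.functional as F

def motion_supervision_loss(feat: torch.Tensor,
                            handle: torch.Tensor,
                            target: torch.Tensor,
                            radius: int = 3) -> torch.Tensor:
    """Encourage features near `handle` to move one step toward `target`.

    feat:   (1, C, H, W) generator feature map, differentiable w.r.t. the latent
    handle: (2,) current handle point (x, y) in pixel coordinates
    target: (2,) user-specified target point (x, y)
    """
    direction = target - handle
    direction = direction / (direction.norm() + 1e-8)  # unit step toward target
    _, _, h, w = feat.shape

    def grid_at(points: torch.Tensor) -> torch.Tensor:
        # Normalize pixel coordinates to [-1, 1] for grid_sample.
        norm = torch.stack([points[..., 0] / (w - 1),
                            points[..., 1] / (h - 1)], dim=-1)
        return (norm * 2 - 1).view(1, 1, -1, 2)

    # Sample a small neighborhood around the handle and its shifted copy.
    offsets = torch.stack(torch.meshgrid(
        torch.arange(-radius, radius + 1.0),
        torch.arange(-radius, radius + 1.0), indexing="ij"), dim=-1).view(-1, 2)
    pts = handle + offsets
    f_now = F.grid_sample(feat, grid_at(pts), align_corners=True)
    f_next = F.grid_sample(feat, grid_at(pts + direction), align_corners=True)
    # Pulling feat(q + d) toward the detached feat(q) drags image content
    # at the handle one step along d when optimizing the latent.
    return F.l1_loss(f_next, f_now.detach())
```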
FastComposer can generate personalized images of multiple unseen individuals in various styles and actions without fine-tuning. It is 300x-2500x faster than traditional methods and requires no extra storage for new subjects, using subject embeddings and localized attention to keep identities clear.
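The subject-embedding idea can be pictured as augmenting the text token that names a subject (e.g. "man") with an image-encoder embedding of that person's reference photo, so no per-subject fine-tuning is needed. A hypothetical sketch; the projection MLP, dimensions, and additive fusion rule are placeholders for illustration, not FastComposer's actual modules:

```python
import torch
import torch.nn as nn

class SubjectFusion(nn.Module):
    """Fuse a reference-image embedding into one text-token embedding."""
    def __init__(self, image_dim: int = 1024, text_dim: int = 768):
        super().__init__()
        # Hypothetical MLP projecting image features into text-embedding space.
        self.proj = nn.Sequential(nn.Linear(image_dim, text_dim),
                                  nn.GELU(), nn.Linear(text_dim, text_dim))

    def forward(self, text_embs: torch.Tensor, image_emb: torch.Tensor,
                subject_index: int) -> torch.Tensor:
        """text_embs: (seq_len, text_dim); image_emb: (image_dim,)."""
        fused = text_embs.clone()
        fused[subject_index] = text_embs[subject_index] + self.proj(image_emb)
        return fused

# One prompt of 77 tokens; token 4 names the subject being personalized.
text_embs = torch.randn(77, 768)
image_emb = torch.randn(1024)
out = SubjectFusion()(text_embs, image_emb, subject_index=4)
```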
Ray Conditioning is a lightweight and geometry-free technique for multi-view image generation. You have that perfect portrait shot of a face, but the angle is not right? No problem: just use that shot as an input image and generate the portrait from another angle. Done.
Improved Diffusion-based Image Colorization via Piggybacked Models can colorize grayscale images using knowledge from pre-trained Text-to-Image diffusion models. It allows for conditional colorization with user hints and text prompts, achieving high-quality results.
DiFaReli can relight single-view face images by managing lighting effects like shadows and global illumination. It uses a conditional diffusion model to separate lighting information, achieving photorealistic results without needing 3D data.
Expressive Text-to-Image Generation with Rich Text can create detailed images from text by using rich text formatting like font style, size, and color. This method allows for better control over styles and colors, making it easier to generate complex scenes compared to regular text.
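One way to picture the approach: the rich-text document is flattened into a plain prompt for the overall layout, while each formatted span becomes a region-specific instruction (font color mapping to a target color, font size to concept weight, font style to an art style). A toy sketch of that parsing step; the span representation and attribute names are made up for illustration:

```python
from dataclasses import dataclass

@dataclass
class Span:
    text: str
    color: str | None = None   # e.g. "#dc143c" -> color guidance for this region
    size: float = 1.0          # larger font -> stronger weight for this concept
    style: str | None = None   # e.g. "Ukiyo-e" applied only to this span

def to_prompts(spans: list[Span]) -> tuple[str, list[dict]]:
    """Flatten rich text into a plain prompt plus per-region instructions."""
    plain = " ".join(s.text for s in spans)
    regions = [{"prompt": s.text, "color": s.color,
                "weight": s.size, "style": s.style}
               for s in spans if s.color or s.style or s.size != 1.0]
    return plain, regions

plain, regions = to_prompts([
    Span("a church by a"),
    Span("lake", style="Ukiyo-e"),
    Span("under a"),
    Span("crimson sky", color="#dc143c", size=1.4),
])
print(plain)    # "a church by a lake under a crimson sky"
print(regions)  # the two formatted spans with their style/color/weight hints
```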
Inst-Inpaint can remove objects from images using natural language instructions, which saves time by not needing binary masks. It uses a new dataset called GQA-Inpaint, improving the quality and accuracy of image inpainting significantly.