Image AI Tools
Free image AI tools for generating and editing visuals and creating 3D assets for games, films, and other creative projects.
Entity-Level Text-Guided Image Manipulation can edit specific parts of an image based on text descriptions while keeping other areas unchanged. It uses a two-stage process, first aligning text semantics with image regions and then applying the edit, allowing for flexible and precise manipulation.
MultiDiffusion can generate high-quality images using a pre-trained text-to-image diffusion model. It allows users to control aspects like image size and includes features for guiding images with segmentation masks and bounding boxes.
ControlNet can add control to text-to-image diffusion models. It lets users manipulate image generation using methods like edge detection and depth maps, while working well with both small and large datasets.
Neural Congealing can align similar content across multiple images using a self-supervised method. It uses pre-trained DINO-ViT features to create a shared semantic map, allowing for effective alignment even with different appearances and backgrounds.
Pix2Pix-Zero can perform zero-shot image edits, like turning a cat into a dog, without needing manually written text prompts or per-image training. It preserves the original image's structure and builds on pre-trained text-to-image diffusion models for better editing results.
StyleGAN-T can generate high-quality images at 512x512 resolution in just 2 seconds using a single NVIDIA A100 GPU. It solves problems in text-to-image synthesis, like stable training on diverse datasets and strong text alignment.
CLIPascene can convert scene images into sketches with different levels of detail and simplicity. Users can create a range of sketches, from detailed to simple, allowing for personalized artistic expression.
VectorFusion can generate SVG-exportable vector graphics from text prompts. It uses a text-conditioned diffusion model to create high-quality outputs in various styles, like pixel art and sketches, without needing large datasets of captioned SVGs.
InstructPix2Pix can edit images based on written instructions. It allows users to add or remove objects, change colors, and transform styles quickly, using a conditional diffusion model trained on a large dataset.
MinD-Vis can create realistic images from brain recordings using a method that combines Sparse Masked Brain Modeling and a Double-Conditioned Latent Diffusion Model. It achieves top performance in understanding thoughts and generating images, surpassing previous results by 66% in semantic mapping and 41% in image quality, while needing very few paired examples.
UnZipLoRA can break down an image into its subject and style. This makes it possible to create variations and apply styles to new subjects.
SDEdit can generate and edit photo-realistic images using user-guided inputs like hand-drawn strokes or text prompts. It outperforms GAN-based methods, achieving high scores in realism and overall satisfaction without needing specific training.
Bridging High-Quality Audio and Video via Language for Sound Effects Retrieval from Visual Queries can retrieve high-quality sound effects from a single video frame without needing text metadata. It uses a combination of large language models and contrastive learning to match sound effects to video better than existing methods.
GFPGAN can restore realistic facial details from low-quality images using a pretrained face GAN. It works well on both synthetic and real-world images, allowing for quick restoration with just one pass, unlike older methods.