Image AI Tools
Free image AI tools for generating and editing visuals, creating 3D assets for games and films, and more, to speed up your creative projects.
iCD can be used for zero-shot text-guided image editing with diffusion models. The method encodes real images into latent space in only 3-4 inference steps, after which the image can be edited with a text prompt.
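The invert-then-edit idea can be illustrated with a toy scalar sketch. This is not iCD's actual consistency-distillation math: `denoise_step`, `generate`, and `invert` are hypothetical stand-ins, and the "prompt" is just a scalar target. The point is the structure: a few deterministic steps encode the image into a latent, and regenerating from that latent with a different prompt produces the edit.

```python
def denoise_step(x, prompt_scale):
    # Toy linear "denoiser": a real model would be a text-conditioned
    # U-Net; here the prompt is just a scalar target the state moves toward.
    return x + 0.5 * (prompt_scale - x)

def generate(latent, steps, prompt_scale):
    # Deterministic few-step sampling from a latent.
    x = latent
    for _ in range(steps):
        x = denoise_step(x, prompt_scale)
    return x

def invert(image, steps, prompt_scale):
    # Run the deterministic update backwards to recover the latent that
    # regenerates the image -- the "encode in 3-4 steps" idea.
    x = image
    for _ in range(steps):
        x = 2.0 * x - prompt_scale  # exact inverse of denoise_step
    return x
```

Round-tripping `invert` then `generate` with the source prompt reproduces the image exactly; generating from the same latent with a different prompt scale yields a changed output, which is the editing step.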
Make It Count can generate images with the exact number of objects specified in the prompt while keeping a natural layout. It uses the diffusion model to accurately count and separate objects during the image creation process.
Glyph-ByT5-v2 is a new SDXL model that can generate high-quality visual layouts with text in 10 different languages.
HairFastGAN can transfer hairstyles from one image to another in near real-time. It handles different poses and colors well, achieving high quality in under a second on an Nvidia V100.
DMD2 is a new improved distillation method that can turn diffusion models into efficient one-step image generators.
EditWorld can simulate world dynamics and edit images based on instructions that are grounded in various world scenarios. The method is able to add, replace, delete, and move objects in images, as well as change their attributes and perform other operations.
RectifID is yet another personalization method from user-provided reference images of human faces, live subjects, and certain objects for diffusion models.
Face Adapter is a new face swapping method that can generate facial detail and handle face shape changes with fine-grained control over attributes like identity, pose, and expression.
Analogist can enhance images by colorizing, deblurring, denoising, improving low-light quality, and transferring styles using a text-to-image diffusion model. It uses both visual and text prompts without needing extra training, making it a flexible tool for learning with few examples.
An Empty Room is All We Want can remove furniture from indoor panorama images even Jordan Peterson would be proud. Perfect to see how your or the apartment you’re looking at would look like without all the clutter.
Pair Customization can customize text-to-image models by learning style differences from a single image pair. It separates style and content into different weight spaces, allowing for effective style application without overfitting to specific images.
Anywhere can place any object from an input image into any suitable and diverse location in an output image. Perfect for product placement.
ConsistentID can generate diverse personalized ID images from text prompts using just one reference image. It improves identity preservation with a facial prompt generator and an ID-preservation network, ensuring high quality and variety in the generated images.
MaGGIe can efficiently predict high-quality human instance mattes from coarse binary masks for both image and video input. The method is able to output all instance mattes simultaneously without exploding memory and latency, making it suitable for real-time applications.
Similar to ConsistentID, PuLID is a tuning-free ID customization method for text-to-image generation. This one can also be used to edit images generated by diffusion models by adding or changing the text prompt.
CharacterFactory can generate endless characters that look the same across different images and videos. It uses GANs and word embeddings from celebrity names to ensure characters stay consistent, making it easy to integrate with other models.
Parts2Whole can generate customized human portraits from multiple reference images, including pose images and various aspects of human appearance. The method is able to generate human images conditioned on selected parts from different humans as control conditions, allowing you to create images with specific combinations of facial features, hair, clothes, etc.
TF-GPH can blend images with disparate visual elements together stylistically!
CustomDiffusion360 brings camera viewpoint control to text-to-image models. Only caveat: it requires a 360 degree multi-view dataset of around 50 images per object to work.
StyleBooth is a unified style editing method supporting text-based, exemplar-based and compositional style editing. So basically, you can take an image and change its style by either giving it a text prompt or an example image.