Image AI Tools
Free image AI tools for generating and editing visuals, creating 3D assets for games, films, and more, all to optimize your creative projects.
PIXART-α can generate high-quality images at a resolution of up to 1024px. It reduces training time to 10.8% of Stable Diffusion v1.5, costing about $26,000 and emitting 90% less CO2.
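If you want to try it, PIXART-α has since landed in diffusers. Here's a minimal sketch; the PixArtAlphaPipeline class and the PixArt-alpha/PixArt-XL-2-1024-MS checkpoint name are assumptions based on the current diffusers integration, so double-check them against the docs.

```python
import torch
from diffusers import PixArtAlphaPipeline

# Load the 1024px PIXART-α checkpoint (repo name assumed, verify on the Hub)
pipe = PixArtAlphaPipeline.from_pretrained(
    "PixArt-alpha/PixArt-XL-2-1024-MS", torch_dtype=torch.float16
).to("cuda")

image = pipe("a cozy cabin in a snowy forest, golden hour, highly detailed").images[0]
image.save("cabin.png")
```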
AnimeInbet can generate inbetween frames for cartoon line drawings. Seeing this, we'll hopefully be blessed with higher-framerate anime in the near future.
PGDiff can restore and colorize faces from low-quality images by using details from high-quality images. It effectively fixes issues like scratches and blurriness.
InstaFlow can generate high-quality images in just one step, achieving an FID of 23.3 on MS COCO 2017-5k. It works very fast at about 0.09 seconds per image, using much less computing power than traditional diffusion models.
SyncDreamer is able to generate multiview-consistent images from a single-view image and thus is able to generate 3D models from 2D designs and hand drawings. It wasn’t able to help me in my quest to turn my PFP into a 3D avatar, but someday I’ll get there!
Total Selfie can generate high-quality full-body selfies from close-up selfies and background images. It uses a diffusion-based approach to combine these inputs, creating realistic images in desired poses and overcoming the limits of traditional selfies.
Scenimefy can turn real-world images and videos into high-quality anime scenes. It uses a smart method that keeps important details and produces better results than other tools.
CLE Diffusion can enhance low-light images by letting users control brightness levels and choose specific areas for improvement. It uses an illumination embedding and the Segment-Anything Model (SAM) for precise and natural-looking enhancements.
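I haven't wired up CLE Diffusion myself, but the region-selection part is just SAM. Here's a rough sketch of how you'd grab a mask for the area you want brightened; cle_enhance is a hypothetical stand-in for the actual CLE Diffusion call.

```python
import numpy as np
from PIL import Image
from segment_anything import sam_model_registry, SamPredictor

# Load SAM and segment the region the user clicked on
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = np.array(Image.open("dark_photo.jpg").convert("RGB"))
predictor.set_image(image)

# One positive click on the area to enhance
masks, scores, _ = predictor.predict(
    point_coords=np.array([[420, 310]]),
    point_labels=np.array([1]),
)
region_mask = masks[scores.argmax()]

# Hypothetical: hand the mask plus a target brightness to CLE Diffusion
# result = cle_enhance(image, mask=region_mask, brightness=0.8)
```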
Similar to ControlNet and Composer, IP-Adapter is a multi-modal guidance adapter for image prompts which works with Stable Diffusion models fine-tuned from the same base model. The results look amazing.
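IP-Adapter is easy to try via the diffusers integration. A minimal sketch, assuming the h94/IP-Adapter weights on an SD 1.5 base; swap in whatever model you're actually using.

```python
import torch
from diffusers import StableDiffusionPipeline
from diffusers.utils import load_image

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Attach the IP-Adapter image-prompt weights (repo/filename assumed, check the Hub)
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin")
pipe.set_ip_adapter_scale(0.6)  # how strongly the image prompt steers generation

style_image = load_image("reference_style.png")
out = pipe(
    prompt="a portrait of a knight in ornate armor",
    ip_adapter_image=style_image,
    num_inference_steps=30,
).images[0]
out.save("knight.png")
```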
RIP expensive low-light cameras? It's amazing how AI is able to solve problems that until now could only be tackled with better hardware. In this example the novel LED model is able to denoise low-light images after being trained on only 6 image pairs. The results are impressive, but the team is not done yet: they're currently researching a method that works across a wide variety of scenarios with only 2 training pairs.
DWPose is a pose estimator that uses a two-stage distillation approach to improve pose-estimation accuracy.
Interpolating between Images with Diffusion Models can generate smooth transitions between two images using latent diffusion models. It allows for high-quality results across different styles and subjects while using CLIP to select the best images for interpolation.
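The core trick behind this kind of interpolation is blending the noised latents (and text embeddings) of the two images before denoising, usually with spherical rather than linear interpolation so the blend stays on the Gaussian shell. Here's a minimal slerp sketch; it's the generic building block, not the authors' full pipeline.

```python
import torch

def slerp(v0: torch.Tensor, v1: torch.Tensor, t: float, eps: float = 1e-7) -> torch.Tensor:
    """Spherical interpolation between two latent tensors at ratio t in [0, 1]."""
    a, b = v0.flatten().float(), v1.flatten().float()
    cos_theta = torch.dot(a / a.norm(), b / b.norm()).clamp(-1 + eps, 1 - eps)
    theta = torch.acos(cos_theta)
    return (torch.sin((1 - t) * theta) * v0 + torch.sin(t * theta) * v1) / torch.sin(theta)

# Example: midpoint between two noised latents of shape (1, 4, 64, 64)
z0, z1 = torch.randn(1, 4, 64, 64), torch.randn(1, 4, 64, 64)
z_mid = slerp(z0, z1, 0.5)  # feed this into the denoising loop
```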
FABRIC can condition diffusion models on feedback images to improve image quality. This method allows users to personalize content through multiple feedback rounds without needing training.
AnimateDiff is a new framework that brings video generation to the Stable Diffusion pipeline. This means you can generate videos with any existing Stable Diffusion model without having to fine-tune or train anything. Pretty amazing. @DigThatData put together a Google Colab notebook in case you want to give it a try.
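Besides the Colab, diffusers now ships an AnimateDiff pipeline, so you can bolt the motion module onto a vanilla SD 1.5 checkpoint. A minimal sketch, assuming the guoyww motion-adapter repo name; check the Hub for the current one.

```python
import torch
from diffusers import AnimateDiffPipeline, MotionAdapter, DDIMScheduler
from diffusers.utils import export_to_gif

# Motion module trained for SD 1.5 (repo name assumed)
adapter = MotionAdapter.from_pretrained(
    "guoyww/animatediff-motion-adapter-v1-5-2", torch_dtype=torch.float16
)

# Any existing SD 1.5 checkpoint works as the base
pipe = AnimateDiffPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", motion_adapter=adapter, torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = DDIMScheduler.from_config(
    pipe.scheduler.config, beta_schedule="linear", clip_sample=False, timestep_spacing="linspace"
)

frames = pipe(
    prompt="a corgi running on the beach, sunset, film grain",
    num_frames=16,
    num_inference_steps=25,
    guidance_scale=7.5,
).frames[0]
export_to_gif(frames, "corgi.gif")
```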
Text2Cinemagraph can create cinemagraphs from text descriptions, animating elements like flowing rivers and drifting clouds. It combines artistic images with realistic ones to accurately show motion, outperforming other methods in generating cinemagraphs for natural and artistic scenes.
CSD-Edit is a multi-modality editing approach that, compared to other methods, works well beyond the traditional 512x512 limit: it can edit 4K images and large panoramas, offers improved temporal consistency across video frames, and improved view consistency when editing or generating 3D scenes.
DreamDiffusion can generate high-quality images from brain EEG signals without needing to translate thoughts into text. It uses pre-trained text-to-image models and special techniques to handle noise and individual differences, making it a key step towards affordable thoughts-to-image technology.
DiffSketcher is a tool that can turn words into vectorized free-hand sketches. The method also lets you define the level of abstraction, allowing for more abstract or concrete generations.
Diffusion with Forward Models is able to reconstruct 3D scenes from a single input image. It can also add small, short motions to images with people in them.
Cocktail is a pipeline for guiding image generation. Compared to ControlNet, it only requires one generalized model for multiple modalities like edge, pose, and mask guidance.