Image AI Tools
Free image AI tools for generating and editing visuals, creating 3D assets for games and films, and optimizing your creative projects.
IntrinsicAnything can recover object materials from any image and enables single-view image relighting.
VQ-Diffusion can generate high-quality images from text prompts using a vector-quantized variational autoencoder (VQ-VAE) and a conditional denoising diffusion model. It is up to fifteen times faster than traditional autoregressive methods and handles complex scenes effectively.
MOWA is a multiple-in-one image warping model that can be used for various tasks such as rectangling panoramic images, correcting rolling-shutter images, correcting rotated images, rectifying fisheye images, and image retargeting.
ControlNet++ can improve image generation by ensuring that generated images match the given controls, like segmentation masks and depth maps. It outperforms its predecessor, ControlNet, with improvements of 7.9% in mIoU, 13.4% in SSIM, and 7.6% in RMSE for segmentation mask, line-art edge, and depth conditions, respectively.
PanFusion can generate 360-degree panorama images from a text prompt. The model is able to integrate additional constraints like room layout for customized panorama outputs.
MindBridge can reconstruct images from fMRI brain signals using a single model that works for different people. It achieves high accuracy even with limited data, making it effective for new subjects.
GoodDrag can improve the stability and image quality of drag editing with diffusion models. It reduces distortions by alternating between drag and denoising operations and introduces a new dataset, Drag100, for better quality assessment.
ZeST can change the material of an object in an image to match a material example image. It can also apply multiple material edits within a single image and make implicit lighting-aware edits to renderings of textured meshes.
Imagine Colorization leverages pre-trained diffusion models to colorize images while giving users interactive control over the result.
MuDI can generate high-quality images of multiple subjects without mixing their identities. It has a 2x higher success rate for personalizing images and is preferred by over 70% of users in evaluations.
NeRF2Physics can predict the physical properties (mass, friction, hardness, thermal conductivity and Young’s modulus) of objects from a collection of images. This makes it possible to simulate the physical behavior of digital twins in a 3D scene.
LCM-Lookahead is another attempted LoRA killer with an LCM-based approach for identity transfer in text-to-image generation.
InstantStyle can separate style and content from images in text-to-image generation without tuning. It improves visual style by using features from reference images while keeping text control and preventing style leaks.
CosmicMan can generate high-quality, photo-realistic human images that match text descriptions closely. It uses a unique method called Annotate Anyone and a training framework called Decomposed-Attention-Refocusing (Daring) to improve the connection between text and images.
Following spatial instructions in text-to-image prompts is hard! SPRIGHT-T2I can finally do it though, resulting in more coherent and accurate compositions.
ID2Reflectance can generate high-quality facial reflectance maps from a single image.
Learning Inclusion Matching for Animation Paint Bucket Colorization can colorize line art in animations by allowing artists to colorize just one frame. The algorithm then automatically applies the color to the rest of the frames, using a learning-based inclusion matching pipeline for more accurate results.
PAID is a method that enables smooth, high-consistency image interpolation for diffusion models. GANs have been the king in that field so far, but this method shows promising results for diffusion models.
Attribute Control enables fine-grained control over attributes of specific subjects in text-to-image models. This lets you modify attributes like age, width, makeup, smile and more for each subject independently.
FlashFace can personalize photos by using one or a few reference face images and a text prompt. It keeps important details like scars and tattoos while balancing text and image guidance, making it useful for face swapping and turning virtual characters into real people.