Image AI Tools
Free image AI tools for generating and editing visuals and creating 3D assets for games, films, and other creative projects.
Exploiting Diffusion Prior for Real-World Image Super-Resolution can restore high-quality images from low-resolution inputs using pre-trained text-to-image diffusion models. It allows users to balance image quality and fidelity through a controllable feature wrapping module and adapts to different image resolutions with a progressive aggregation sampling strategy.
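The quality/fidelity trade-off is essentially an interpolation in feature space. Here is a minimal sketch of that idea, assuming hypothetical decoder/encoder feature tensors; the paper's actual controllable feature wrapping module uses trained convolutional layers rather than the plain blend shown here:

```python
import torch

def feature_wrapping(decoder_feat: torch.Tensor,
                     encoder_feat: torch.Tensor,
                     w: float) -> torch.Tensor:
    """Blend generative decoder features with fidelity-preserving
    encoder features. w=0 favors the diffusion prior (quality),
    w=1 favors the low-resolution input (fidelity). The paper's CFW
    module learns this transform; a linear blend stands in here."""
    return decoder_feat + w * (encoder_feat - decoder_feat)

# Example: two 256-channel feature maps, blended halfway.
dec = torch.randn(1, 256, 64, 64)
enc = torch.randn(1, 256, 64, 64)
out = feature_wrapping(dec, enc, w=0.5)
```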
MagicMan can generate high-quality multi-view images and normal maps of humans from a single photo.
TurboEdit enables fast text-based image editing in just 3-4 diffusion steps! It improves edit quality and preserves the original image by using a shifted noise schedule and a pseudo-guidance approach, tackling issues like visual artifacts and weak edits.
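A rough sketch of what a few-step, guidance-boosted edit loop can look like, with `denoise` as an assumed wrapper that returns a clean-image prediction; the names, timesteps, and extrapolation weight below are illustrative, not TurboEdit's actual API:

```python
import torch

@torch.no_grad()
def few_step_edit(x, denoise, src_prompt, tgt_prompt,
                  timesteps=(799, 599, 399, 199), guidance=1.5):
    """Illustrative few-step text-based editing loop. Pseudo-guidance
    extrapolates from the source-prompt prediction toward the
    target-prompt prediction, strengthening the edit while the
    source branch anchors the original image content."""
    for t in timesteps:
        pred_src = denoise(x, t, src_prompt)   # reconstruction direction
        pred_tgt = denoise(x, t, tgt_prompt)   # edit direction
        x = pred_src + guidance * (pred_tgt - pred_src)
    return x
```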
TextBoost can enable one-shot personalization of text-to-image models by fine-tuning the text encoder. It generates diverse images from a single reference image while reducing overfitting and memory needs.
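Fine-tuning only the text encoder is easy to express in PyTorch. A minimal sketch, assuming a `diffusion_loss` helper that computes the usual denoising objective for a batch (not TextBoost's exact training code, which adds its own augmentations and regularization):

```python
from torch import nn
from torch.optim import AdamW

def personalize(text_encoder: nn.Module, unet: nn.Module,
                diffusion_loss, batches, steps=500, lr=1e-5):
    """Only the text encoder receives gradients; the image backbone
    stays frozen, which keeps memory low and limits overfitting to
    the single reference image."""
    unet.requires_grad_(False)
    text_encoder.requires_grad_(True)
    opt = AdamW(text_encoder.parameters(), lr=lr)
    for _, batch in zip(range(steps), batches):
        loss = diffusion_loss(text_encoder, unet, batch)
        loss.backward()
        opt.step()
        opt.zero_grad()
    return text_encoder
```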
NeuroPictor can improve fMRI-to-image reconstruction by using fMRI signals to control diffusion models. It is trained on over 67,000 fMRI-image pairs, allowing for better accuracy in generating images that reflect both high-level concepts and fine details.
Text2Place can place any human or object realistically into diverse backgrounds. This enables scene hallucination (generating a scene compatible with a given human pose), text-based editing of the placed subject, and inserting multiple people into a scene.
One-DM can generate handwritten text from a single reference sample, mimicking the style of the input. It captures unique writing patterns and works well across multiple languages.
Diffusion2GAN is a method to distill a complex multistep diffusion model into a single-step conditional GAN student, dramatically accelerating inference while preserving image quality. This enables one-step 512px/1024px image generation at interactive speeds of 0.09/0.16 seconds, as well as 4K image upscaling!
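Conceptually, the student regresses precomputed teacher noise-to-image pairs and adds an adversarial term for sharpness. A schematic training step follows; the paper pairs the regression loss with E-LatentLPIPS, while plain MSE stands in below, and all names are illustrative:

```python
import torch.nn.functional as F

def distillation_step(student, discriminator, noise, teacher_image, cond):
    """One-step distillation sketch: `teacher_image` is the output of
    the multistep diffusion teacher for this exact `noise`/`cond`
    pair, so the student learns a direct noise-to-image mapping."""
    fake = student(noise, cond)                   # single forward pass
    regression = F.mse_loss(fake, teacher_image)  # paired distillation loss
    adv = -discriminator(fake, cond).mean()       # non-saturating GAN term
    return regression + 0.1 * adv
```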
LinFusion can generate high-resolution images up to 16K in just one minute using a single GPU. It improves performance on various Stable Diffusion versions and works with pre-trained components like ControlNet and IP-Adapter.
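LinFusion's scalability comes from replacing quadratic self-attention with a linear-complexity token mixer, so cost grows linearly with pixel count. Below is a generic linear-attention sketch of that family, not LinFusion's exact module (which is a generalized, normalization-aware variant):

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v):
    """q, k, v: (batch, tokens, dim). Computes attention in O(tokens)
    by summarizing keys/values into a (dim, dim) matrix instead of
    forming the (tokens, tokens) softmax map."""
    q = F.elu(q) + 1                                  # positive feature map
    k = F.elu(k) + 1
    kv = torch.einsum('btd,bte->bde', k, v)           # key-value summary
    z = 1.0 / (torch.einsum('btd,bd->bt', q, k.sum(1)) + 1e-6)
    return torch.einsum('btd,bde,bt->bte', q, kv, z)
```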
CSGO can perform image-driven style transfer and text-driven stylized synthesis. It uses a large dataset with 210k image triplets to improve style control in image generation.
tps-inbetween can generate high-quality intermediate frames for animation line art. It effectively connects lines and fills in missing details, even during fast movements, using a method that models keypoint relationships between frames.
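The keypoint side of this reduces to producing intermediate positions for matched points. A toy sketch is below; tps-inbetween goes further, fitting a thin-plate-spline warp to these correspondences to move the actual line art, which is omitted here:

```python
import torch

def interpolate_keypoints(kps_a, kps_b, t):
    """Matched keypoints (N, 2) on two key frames -> positions at
    intermediate time t in [0, 1]."""
    return (1 - t) * kps_a + t * kps_b

# Midway positions for five matched keypoints.
mid = interpolate_keypoints(torch.rand(5, 2), torch.rand(5, 2), 0.5)
```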
Iterative Object Count Optimization can improve how accurately text-to-image diffusion models render the number of objects specified in a prompt.
MagicFace can generate high-quality images of people in any style without needing extra training. It uses dedicated attention mechanisms for precise attribute alignment and feature injection, and works for both single- and multi-concept customization.
Generative Photomontage can combine parts of multiple AI-generated images using a brush tool. It lets users create new appearance combinations, correct shapes and artifacts, and improve prompt alignment, outperforming existing image blending methods.
Filtered Guided Diffusion shows that image-to-image translation and editing doesn’t necessarily require additional training. FGD simply applies a filter to the input of each diffusion step, based on the output of the previous step, in an adaptive manner, which makes the approach easy to implement.
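Because the guidance is just filtering between steps, a sketch is short. Here `step_fn` is an assumed single reverse-diffusion step, and the average-pooling low-pass is a stand-in for the paper's adaptive filters:

```python
import torch
import torch.nn.functional as F

def lowpass(x, k=9):
    # Cheap low-pass via average pooling with reflect padding.
    p = k // 2
    return F.avg_pool2d(F.pad(x, (p, p, p, p), mode='reflect'), k, stride=1)

@torch.no_grad()
def filtered_guided_sampling(x, reference, step_fn, timesteps, strength=0.7):
    """Training-free image-to-image sketch: after each denoising step,
    pull the sample's low-frequency structure toward the reference,
    annealing the constraint so fine detail can emerge late."""
    n = len(timesteps)
    for i, t in enumerate(timesteps):
        x = step_fn(x, t)
        w = strength * (1 - i / max(n - 1, 1))   # fade the filter guidance
        x = x + w * (lowpass(reference) - lowpass(x))
    return x
```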
Matryoshka Diffusion Models can generate high-quality images and videos using a NestedUNet architecture that denoises inputs jointly at multiple resolutions. This method achieves strong performance at resolutions up to 1024x1024 pixels and supports effective end-to-end training without needing cascaded or latent diffusion.
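Training at several nested resolutions at once can be sketched as a summed multi-scale denoising loss, with `nested_unet` as an assumed model that returns one prediction per scale; the real objective uses proper per-resolution noise schedules, and the noising below is schematic:

```python
import torch.nn.functional as F

def nested_denoising_loss(nested_unet, x0, noise, t, scales=(1, 2, 4)):
    """Joint multi-resolution denoising: the same image is corrupted
    and denoised at each scale, and per-scale losses are summed."""
    noisy, targets = [], []
    for s in scales:
        x = F.avg_pool2d(x0, s) if s > 1 else x0
        eps = F.avg_pool2d(noise, s) if s > 1 else noise
        noisy.append(x + eps)   # schematic; real code mixes by alpha(t), sigma(t)
        targets.append(eps)
    preds = nested_unet(noisy, t)   # one prediction per resolution
    return sum(F.mse_loss(p, e) for p, e in zip(preds, targets))
```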
Sprite-Decompose can break down animated graphics into sprites using videos and box outlines.
IPAdapter-Instruct can efficiently combine natural-image conditioning with “Instruct” prompts! It enables users to switch between various interpretations of the same image, such as style transfer and object extraction.
Lumina-mGPT can create photorealistic images from text and handle a range of vision and language tasks! It uses a multimodal autoregressive transformer, making it possible to control image generation, perform segmentation, estimate depth, and answer visual questions in multiple steps.