Image AI Tools
Free image AI tools for generating and editing visuals, creating 3D assets for games and films, and otherwise streamlining your creative projects.
MV-Adapter can generate multi-view images that stay consistent across views. It enhances text-to-image models like Stable Diffusion XL, supports both text and image inputs, and produces high-resolution outputs at 768x768.
Anagram-MTL can generate visual anagrams that change appearance with transformations like flipping or rotating.
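For context, the usual diffusion-anagram recipe averages the model's noise predictions across each view's transform so the result reads correctly under all of them. The sketch below illustrates that consensus step with a hypothetical denoiser; Anagram-MTL's exact training objective may differ.

```python
# Minimal sketch of the common visual-anagram trick: average noise
# predictions over (transform, inverse) view pairs so one image is
# valid under every transformation. `model` is a hypothetical denoiser.
import torch

def anagram_step(latent, t, model, views):
    preds = []
    for view, inv in views:                  # (transform, inverse) pairs
        preds.append(inv(model(view(latent), t)))
    return torch.stack(preds).mean(0)        # consensus noise estimate

views = [
    (lambda x: x, lambda x: x),                              # identity
    (lambda x: torch.rot90(x, 1, (-2, -1)),                  # 90-degree rotation
     lambda x: torch.rot90(x, -1, (-2, -1))),
]

dummy = lambda x, t: torch.randn_like(x)     # placeholder denoiser
latent = anagram_step(torch.randn(1, 4, 64, 64), 0, dummy, views)
```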
Negative Token Merging can improve image diversity by pushing apart similar features during the reverse diffusion process. It reduces visual similarity to copyrighted content by 34.57% and works with both Stable Diffusion and Flux.
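A minimal sketch of the core idea, assuming token features of shape (batch, tokens, dim): match each token to its most similar counterpart in another sample, then extrapolate away from it. The matching rule and the alpha value here are illustrative, not the paper's exact algorithm.

```python
import torch
import torch.nn.functional as F

def push_apart(src: torch.Tensor, tgt: torch.Tensor, alpha: float = 0.1):
    """Push each source token away from its most similar target token."""
    sim = F.normalize(src, dim=-1) @ F.normalize(tgt, dim=-1).transpose(-1, -2)
    idx = sim.argmax(dim=-1)                          # best-matching target token
    matched = torch.gather(tgt, 1, idx.unsqueeze(-1).expand_as(src))
    # Linear extrapolation away from the matched token reduces similarity.
    return src + alpha * (src - matched)

tokens_a = torch.randn(1, 77, 768)
tokens_b = torch.randn(1, 77, 768)
tokens_a = push_apart(tokens_a, tokens_b)
```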
FlowEdit can edit images using only text prompts with Flux and Stable Diffusion 3.
Ever tried to inpaint smaller objects and details into an image? It can be hit or miss. SOEDiff has been trained specifically for these cases and does a pretty good job at it.
MegaFusion can extend existing diffusion models to high-resolution image generation. It produces images up to 2048x2048 at only 40% of the original computational cost by coordinating the denoising process across resolutions.
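One way to picture this is a coarse-to-fine relay: run most denoising steps at the base resolution, upsample the latent, then finish at the target scale. The sketch below assumes that structure; `denoise_step` is a stand-in for a real scheduler/UNet step, not MegaFusion's actual code.

```python
import torch
import torch.nn.functional as F

def denoise_step(latent, t):               # placeholder for model + scheduler
    return latent - 0.01 * torch.randn_like(latent)

latent = torch.randn(1, 4, 96, 96)         # base-resolution latent
for t in range(1000, 300, -10):            # early steps at low resolution
    latent = denoise_step(latent, t)

latent = F.interpolate(latent, scale_factor=2, mode="bilinear")  # relay upward
for t in range(300, 0, -10):               # late steps at the larger latent
    latent = denoise_step(latent, t)
```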
DreamMix is an inpainting method based on the Fooocus model that can add objects from reference images and change their attributes using text.
Omegance can control detail levels in diffusion-based synthesis using a single parameter, ω. It allows for precise granularity control in generated outputs and enables specific adjustments through spatial masks and denoising schedules.
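A minimal sketch of what a single-knob detail control like this could look like: rescale the predicted noise by omega inside each denoising step, optionally with a spatial map for region-wise control. Where exactly Omegance applies its rescaling is an assumption here.

```python
import torch

def omega_step(latent, noise_pred, step_size, omega=1.0):
    # omega > 1 removes noise more aggressively (smoother, less detail);
    # omega < 1 removes less (busier, more detail).
    return latent - step_size * (omega * noise_pred)

latent = torch.randn(1, 4, 64, 64)
noise_pred = torch.randn_like(latent)
omega_map = torch.ones(1, 1, 64, 64)
omega_map[..., 32:, :] = 0.8               # finer detail in the lower half
latent = omega_step(latent, noise_pred, 0.05, omega_map)
```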
FlipSketch can generate sketch animations from static drawings by allowing users to describe the desired motion. It uses motion priors from text-to-video diffusion models to create smooth animations while keeping the original sketch’s look.
StyleCodes can encode the style of an image into a 20-symbol base64 code for easy sharing and reuse in image generation. It lets users create style-reference codes (srefs) from their own images, enabling high-quality style control in diffusion models.
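As a back-of-the-envelope illustration, 20 base64 symbols hold exactly 15 bytes (120 bits), so a style embedding has to be projected and quantized down to that budget. The projection and quantization below are illustrative placeholders; StyleCodes learns its encoder end to end.

```python
import base64
import numpy as np

def to_sref(style_embedding: np.ndarray) -> str:
    proj = style_embedding[:15]                               # stand-in for a learned 15-dim projection
    q = np.clip(proj * 127 + 128, 0, 255).astype(np.uint8)   # 8-bit quantization
    return base64.b64encode(q.tobytes()).decode()             # 15 bytes -> 20 chars

code = to_sref(np.random.randn(512).astype(np.float32))
assert len(code) == 20
```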
SGEdit can add, remove, replace, and adjust objects in images while preserving overall image quality.
MagicQuill enables efficient image editing with a simple interface that lets users easily insert elements and change colors. It uses a large language model to understand editing intentions in real time, improving the quality of the results.
RayGauss can render realistic novel views of 3D scenes using Gaussian-based ray casting! It produces high-quality images at 25 frames per second and avoids the visual artifacts that older methods suffered from.
Regional-Prompting-FLUX adds regional prompting capabilities to diffusion transformers like FLUX. It effectively manages complex prompts and works well with tools like LoRA and ControlNet.
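The core idea behind regional prompting is that each prompt only influences its own spatial region of the image tokens. A toy sketch of that blending step, with shapes and the blend rule as assumptions rather than the FLUX implementation:

```python
import torch

H = W = 64                                  # latent token grid
masks = torch.zeros(2, H, W)
masks[0, :, : W // 2] = 1                   # left half  -> prompt 0
masks[1, :, W // 2 :] = 1                   # right half -> prompt 1

def blend_regional(features: torch.Tensor) -> torch.Tensor:
    """features: (num_prompts, H*W, dim) per-prompt cross-attention outputs."""
    m = masks.view(2, -1, 1)                # (num_prompts, H*W, 1)
    return (features * m).sum(0)            # each token keeps its region's prompt

out = blend_regional(torch.randn(2, H * W, 128))
```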
ZIM can generate precise matte masks from segmentation labels, enabling zero-shot image matting.
Face Anon can anonymize faces in images while keeping original facial expressions and head positions. It uses diffusion models to achieve high-quality image results and can also perform face swapping tasks.
ScalingConcept can enhance or suppress existing concepts in images and audio without adding new elements. It can generate poses, enhance object stitching and reduce fuzziness in anime productions.
ControlAR adds controls like edges, depths, and segmentation masks to autoregressive models like LlamaGen.
State-of-the-art diffusion models are trained on square images. FiT is a new transformer architecture specifically designed for generating images with unrestricted resolutions and aspect ratios (similar to what Sora does). This enables a flexible training strategy that adapts effortlessly to diverse aspect ratios during both training and inference, promoting resolution generalization and eliminating the biases induced by image cropping.
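The key move is treating an image as a variable-length token sequence instead of a fixed square grid: patchify whatever resolution you have, then pad to a maximum length with an attention mask rather than cropping. A rough sketch of that idea, with patch size and sequence length as assumptions:

```python
import torch

def patchify(img: torch.Tensor, patch: int = 16) -> torch.Tensor:
    """Flatten an image of any aspect ratio into a sequence of patch tokens."""
    c, h, w = img.shape                     # h and w divisible by patch
    t = img.unfold(1, patch, patch).unfold(2, patch, patch)
    return t.reshape(c, -1, patch * patch).permute(1, 0, 2).reshape(-1, c * patch * patch)

seq = patchify(torch.randn(3, 256, 448))    # non-square image -> 448 tokens
max_len = 1024
pad = max_len - seq.shape[0]
seq = torch.cat([seq, seq.new_zeros(pad, seq.shape[1])])   # pad, don't crop
mask = torch.arange(max_len) < (max_len - pad)             # attention mask
```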
From Text to Pose to Image can generate high-quality images from text prompts by first creating poses and then using them to guide image generation. This method improves control over human poses and enhances image fidelity in diffusion models.
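A hedged sketch of that two-stage flow: `sample_pose_from_text` is a hypothetical stand-in for the paper's pose generator, and the second stage uses an off-the-shelf OpenPose ControlNet as one concrete way to condition image generation on the sampled pose.

```python
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

def sample_pose_from_text(prompt: str) -> Image.Image:
    # Hypothetical stand-in: a real implementation would sample an
    # OpenPose skeleton from the prompt; here we return a blank canvas.
    return Image.new("RGB", (512, 512))

prompt = "a dancer mid-leap on a beach at sunset"
pose = sample_pose_from_text(prompt)        # stage 1: text -> pose

controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-openpose")
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet
)
image = pipe(prompt, image=pose).images[0]  # stage 2: pose-guided generation
image.save("dancer.png")
```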