Image Editing
Free image editing AI tools for quickly enhancing photos, creating visuals, and manipulating images for projects in art, marketing, and design.
OHTA can create detailed and usable hand avatars from just one image. It allows for text-to-avatar conversion and editing of hand textures and shapes, using data-driven hand priors to improve accuracy with limited input.
Multi-LoRA Composition focuses on the integration of multiple Low-Rank Adaptations (LoRAs) to create highly customized and detailed images. The approach is able to generate images with multiple elements without fine-tuning and without losing detail or image quality.
Continuous 3D Words is a control method that can modify attributes in images with a slider based approach. This allows for more control over illumination, non-rigid shape changes (like wings), and camera orientation for instance.
SEELE can move around objects within an image. It does so by removing it, inpainting occluded portions and harmonizing the appearance of the repositioned object with the surrounding areas.
StableIdentity is a method that can generate diverse customized images in various contexts from a single input image. The cool thing about this method is, that it is able to combine the learned identity with ControlNet and even inject it into video (ModelScope) and 3D (LucidDreamer) generation.
InstantID is a ID embedding-based method that can be used to personalize images in various styles using just a single facial image, while ensuring high fidelity.
ControlNet-XS can control text-to-image diffusion models like Stable Diffusion and Stable Diffusion-XL with only 1% of the parameters of the base model. It is about twice as fast as ControlNet and produces higher quality images with better control.
AmbiGen can generate ambigrams by optimizing letter shapes for clear reading from two angles. It improves word accuracy by over 11.6% and reduces edit distance by 41.9% on the 500 most common English words.
Readout Guidance can control text-to-image diffusion models using lightweight networks called readout heads. It enables pose, depth, and edge-guided generation with fewer parameters and training samples, allowing for easier manipulation and consistent identity generation.
Material Palette can extract a palette of PBR materials (albedo, normals, and roughness) from a single real-world image. Looks very useful for creating new materials for 3D scenes or even for generating textures for 2D art.
It’s been a while since I last doomed the TikTok dancers. MagicDance is gonna doom them some more. This model can combine human motion with reference images to precisely generate appearance-consistent videos. While the results still contain visible artifacts and jittering, give it a few months and I’m sure we can’t tell the difference no more.
Object-aware Inversion and Reassembly can edit multiple objects in an image by finding the best steps for each edit. It allows for precise changes in shapes, colors, and materials while keeping the rest of the image intact.
HyperHuman is a text-to-image model that focuses on generating hyper-realistic human images from text prompts and a pose image. The results are pretty impressive and the model is able to generate images in different styles and up to a resolution of 1024x1024.
[Total Selfie] can generate high-quality full-body selfies from close-up selfies and background images. It uses a diffusion-based approach to combine these inputs, creating realistic images in desired poses and overcoming the limits of traditional selfies.
CLE Diffusion can enhance low-light images by letting users control brightness levels and choose specific areas for improvement. It uses an illumination embedding and the Segment-Anything Model (SAM) for precise and natural-looking enhancements.
Interpolating between Images with Diffusion Models can generate smooth transitions between two images using latent diffusion models. It allows for high-quality results across different styles and subjects while using CLIP to select the best images for interpolation.
FABRIC can condition diffusion models on feedback images to improve image quality. This method allows users to personalize content through multiple feedback rounds without needing training.
Text2Cinemagraph can create cinemagraphs from text descriptions, animating elements like flowing rivers and drifting clouds. It combines artistic images with realistic ones to accurately show motion, outperforming other methods in generating cinemagraphs for natural and artistic scenes.
CSD-Edit is a multi modality editing approach that compared to other methods works great on images bigger than the traditional 512x512 limitation and can edit 4k or large panorama images, has improved temporal consistency on video frames as well as improved view consistency when editing or generating 3D scenes.
Break-A-Scene can extract multiple concepts from a single image using segmentation masks. It allows users to re-synthesize individual concepts or combinations in different contexts, enhancing scene generation with a two-phase customization process.