Controllable Image Generation
Free AI tools for controllable image generation, helping artists and designers produce tailored visuals for their projects.
ByteDance published PeRFlow, a new low-step method that accelerates diffusion models like Stable Diffusion so they can generate images in far fewer denoising steps. PeRFlow is compatible with various fine-tuned stylized SD models as well as SD-based generation/editing pipelines such as ControlNet, Wonder3D, and more.
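To make the speed-up concrete, here is a minimal few-step sampling sketch using diffusers; the model path is a placeholder, and actual PeRFlow use requires the scheduler and accelerated weights from its release.

```python
# Minimal sketch of few-step sampling with diffusers (assumed workflow;
# the real PeRFlow release ships its own scheduler and delta weights).
import torch
from diffusers import StableDiffusionPipeline

# Placeholder model id: substitute a PeRFlow-accelerated SD checkpoint.
pipe = StableDiffusionPipeline.from_pretrained(
    "path/to/perflow-accelerated-sd15", torch_dtype=torch.float16
).to("cuda")

# A low-step method aims to keep quality while cutting denoising steps
# from the usual 25-50 down to a handful.
image = pipe(
    "a watercolor fox in a snowy forest",
    num_inference_steps=4,
    guidance_scale=4.5,
).images[0]
image.save("fox.png")
```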
XVerse can create high-quality images containing multiple subjects, each of which can be edited independently. It allows precise control over each subject's pose, style, and lighting, while also reducing issues like attribute entanglement and artifacts.
PreciseCam can generate images with exact control over camera angles and lens distortions using four simple camera settings.
ID-Patch can generate personalized group photos by binding each provided face to a specified position. It reduces problems like identity leakage and visual artifacts, achieving high accuracy while running about seven times faster than comparable methods.
IMAGGarment-1 can generate high-quality garments with control over shape, color, and logo placement.
IP-Composer can generate compositional images by using multiple input images and natural language prompts.
UNO brings subject transfer and preservation from a reference image to FLUX with a single model.
DreamRenderer extends FLUX with image content control using bounding boxes or masks.
UniCon can handle different image generation tasks using a single framework. It adapts a pretrained image diffusion model with only about 15% extra parameters and supports most of the conditioning types handled by base ControlNets.
Factor Graph Diffusion can generate high-quality images with better prompt adherence. The method allows controllable image creation using conditions such as segmentation maps and depth maps.
Negative Token Merging can improve image diversity by pushing apart similar features during the reverse diffusion process. It reduces visual similarity with copyrighted content by 34.57% and works well with Stable Diffusion as well as FLUX.
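A rough sketch of the push-apart idea (my reading of it; the official implementation may differ): for each token of one image in the batch, find the most similar token of a sibling image and move away from it.

```python
# Minimal sketch of "pushing apart similar features" across a batch of
# attention outputs (illustrative only, not the official implementation).
import torch
import torch.nn.functional as F

def push_apart(tokens: torch.Tensor, alpha: float = 0.1) -> torch.Tensor:
    """tokens: (batch, seq, dim) attention outputs for several images generated
    from similar prompts. For every token in image i, find its most similar
    token in image i-1 and subtract a small fraction of it."""
    out = tokens.clone()
    normed = F.normalize(tokens, dim=-1)
    for i in range(1, tokens.shape[0]):
        # cosine similarity between all token pairs of image i and image i-1
        sim = normed[i] @ normed[i - 1].transpose(0, 1)   # (seq, seq)
        nearest = sim.argmax(dim=-1)                      # (seq,)
        out[i] = tokens[i] - alpha * tokens[i - 1, nearest]
    return out
```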
StyleCodes can encode the style of an image into a 20-symbol base64 code for easy sharing and use in image generation. It lets users create style-reference codes (srefs) from their own images, helping to control the style of diffusion-model outputs while keeping generation quality high.
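For intuition on the budget: 20 base64 symbols carry 20 * 6 = 120 bits, i.e. roughly a 15-byte quantized style embedding. The sketch below only illustrates that encoding budget, not the StyleCodes model itself.

```python
# Back-of-the-envelope sketch of the 20-character style-code budget
# (illustrative only; StyleCodes learns its embedding, this just shows the math).
import base64
import numpy as np

def embed_to_sref(style_embedding: np.ndarray) -> str:
    """Quantize a 15-dim style embedding in [0, 1) to one byte per dimension
    and pack it into a 20-character base64 code."""
    assert style_embedding.shape == (15,)
    quantized = (style_embedding * 255).astype(np.uint8).tobytes()
    return base64.b64encode(quantized).decode("ascii")   # 20 chars, no padding

def sref_to_embed(sref: str) -> np.ndarray:
    raw = base64.b64decode(sref)
    return np.frombuffer(raw, dtype=np.uint8).astype(np.float32) / 255.0

code = embed_to_sref(np.random.rand(15))
print(code, len(code))   # 20-character shareable style code
```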
ControlAR adds controls such as edges, depth maps, and segmentation masks to autoregressive models like LlamaGen.
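A conceptual sketch of per-token conditioning in an autoregressive image model (a simplification, not ControlAR's exact design): encode the control image into features aligned with the image-token raster order and fuse them with the token embeddings before decoding.

```python
# Conceptual sketch: fuse per-position control features with image-token
# embeddings ahead of an autoregressive transformer (simplified illustration).
import torch
import torch.nn as nn

class ControlFusion(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim)   # projects control features

    def forward(self, token_emb: torch.Tensor, control_feat: torch.Tensor):
        # token_emb:    (batch, seq, dim) embeddings of generated image tokens
        # control_feat: (batch, seq, dim) features of the edge/depth/seg map,
        #               aligned to the same raster order as the image tokens
        return token_emb + self.proj(control_feat)

fusion = ControlFusion(dim=768)
tokens = torch.randn(1, 256, 768)
control = torch.randn(1, 256, 768)
conditioned = fusion(tokens, control)   # fed to a LlamaGen-style decoder
```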
CtrLoRA can adapt a base ControlNet for image generation with just 1,000 data pairs in under one hour of training on a single GPU. It reduces learnable parameters by 90%, making it much easier to create new guidance conditions.
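The parameter savings follow the usual LoRA pattern: freeze the base weights and train only a low-rank update. A generic sketch, with adapter placement that may differ from CtrLoRA's:

```python
# Minimal LoRA sketch: instead of fine-tuning a full weight matrix W,
# train a low-rank update, which is why trainable-parameter counts drop sharply.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, scale: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False               # frozen base weight
        self.down = nn.Linear(base.in_features, rank, bias=False)
        self.up = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.up.weight)            # adapter starts as a no-op
        self.scale = scale

    def forward(self, x):
        return self.base(x) + self.scale * self.up(self.down(x))

layer = LoRALinear(nn.Linear(1024, 1024), rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable: {trainable}/{total}")   # the low-rank update is a tiny fraction
```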
LinFusion can generate high-resolution images up to 16K in just one minute using a single GPU. It improves performance on various Stable Diffusion versions and works with pre-trained components like ControlNet and IP-Adapter.
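Reaching such resolutions hinges on attention whose cost grows linearly with the number of tokens. Below is a generic linear-attention sketch that illustrates the trick, not LinFusion's exact formulation.

```python
# Linear attention: computing phi(Q) @ (phi(K)^T V) costs O(N * d^2) instead of
# the O(N^2 * d) of softmax attention, which is what makes very large token
# counts N (i.e. very high resolutions) tractable.
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps: float = 1e-6):
    # q, k, v: (batch, heads, tokens, dim)
    q = F.elu(q) + 1          # simple positive feature map phi
    k = F.elu(k) + 1
    kv = torch.einsum("bhnd,bhne->bhde", k, v)                       # (b, h, d, d)
    z = 1.0 / (torch.einsum("bhnd,bhd->bhn", q, k.sum(dim=2)) + eps) # normalizer
    return torch.einsum("bhnd,bhde,bhn->bhne", q, kv, z)

q = k = v = torch.randn(1, 8, 4096, 64)
out = linear_attention(q, k, v)   # memory grows linearly with token count
```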
SEG improves image generation for SDXL by smoothing the self-attention energy landscape! It boosts quality without needing a guidance scale, using a query-blurring method that adjusts attention weights, leading to better results with fewer drawbacks.
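A minimal sketch of query blurring as I read it (SEG's exact formulation may differ): Gaussian-blur the self-attention queries over their 2D spatial layout so the attention energy landscape becomes smoother, then use the blurred branch as the weak prediction in a guidance pair.

```python
# Sketch of query blurring for self-attention (illustrative simplification).
import math
import torch
import torch.nn.functional as F

def gaussian_kernel1d(sigma, device, dtype):
    radius = int(math.ceil(3 * sigma))
    x = torch.arange(-radius, radius + 1, device=device, dtype=dtype)
    g = torch.exp(-(x ** 2) / (2 * sigma ** 2))
    return g / g.sum()

def blur_queries(q, sigma=3.0):
    # q: (batch_heads, tokens, dim); tokens assumed to form a square h x w grid
    bh, n, d = q.shape
    h = w = int(math.isqrt(n))
    g = gaussian_kernel1d(sigma, q.device, q.dtype)
    k = g.numel()
    grid = q.transpose(1, 2).reshape(bh * d, 1, h, w)    # one channel per feature
    grid = F.conv2d(grid, g.view(1, 1, 1, k), padding=(0, k // 2))  # blur rows
    grid = F.conv2d(grid, g.view(1, 1, k, 1), padding=(k // 2, 0))  # blur cols
    return grid.reshape(bh, d, n).transpose(1, 2)

# The final prediction then mixes the normal and smoothed branches, e.g.:
# guided = blurred_pred + scale * (normal_pred - blurred_pred)
```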
Artist stylizes images based on text prompts, preserving the original content while producing results with high aesthetic quality. No finetuning, no ControlNets; it just works with your pretrained Stable Diffusion model.
AccDiffusion can generate high-resolution images with fewer repeated objects! Something Stable Diffusion has been plagued by since its infancy.
PartCraft can generate customized and photorealistic virtual creatures by mixing visual parts from existing images. This tool allows users to create unique hybrids and make detailed changes, which is useful for digital asset creation and studying biodiversity.
DMD2 is an improved distillation method that can turn diffusion models into efficient one-step image generators.
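For intuition, here is a heavily simplified sketch of the distribution-matching objective behind DMD-style distillation; the names and noise schedule are placeholders, and DMD2 adds further components such as a GAN term and a revised training schedule.

```python
# Simplified sketch of distribution-matching distillation for a one-step
# generator (illustrative only; not the official DMD2 training code).
import torch
import torch.nn.functional as F

def generator_step(generator, real_score, fake_score, z, t, alpha_t, sigma_t):
    """One training step for a one-step generator G: noise z -> image.
    real_score: frozen teacher diffusion model (models the real distribution).
    fake_score: diffusion model continually fit to G's current outputs."""
    x = generator(z)                                    # single forward pass
    noisy = alpha_t * x + sigma_t * torch.randn_like(x) # diffuse to noise level t
    with torch.no_grad():
        # approximate KL gradient: push x toward the teacher's distribution
        grad = fake_score(noisy, t) - real_score(noisy, t)
    # surrogate loss whose gradient w.r.t. x points along `grad`
    return 0.5 * F.mse_loss(x, (x - grad).detach())
```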