Image-to-Image
Free image-to-image AI tools for transforming visuals, perfect for artists needing to modify photos, create variations, and explore new designs.
FitDiT can generate realistic virtual try-on images that show how clothes fit on different body types. It keeps garment textures clear and works quickly, taking only 4.57 seconds for a single image.
TryOffAnyone can generate high-quality images of garments from photos of people wearing them (virtual try-off).
MV-Adapter can generate images from multiple views while keeping them consistent across views. It enhances text-to-image models like Stable Diffusion XL, supporting both text and image inputs, and achieves high-resolution outputs at 768x768.
RayGauss can create realistic new views of 3D scenes using Gaussian-based ray casting. It produces high-quality images quickly, rendering at 25 frames per second, and avoids common rendering artifacts that older methods suffer from.
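At its core, Gaussian-based ray casting means integrating the emission and absorption of 3D Gaussians sampled along each camera ray, rather than rasterizing splats. Below is a toy sketch of that per-ray accumulation; the sampling scheme, parameter names, and shapes are illustrative assumptions, not RayGauss's actual formulation.

```python
import torch

def render_ray(origin, direction, means, inv_cov, densities, colors,
               n_samples=128, t_far=10.0):
    """Toy emission-absorption compositing of 3D Gaussians along one ray."""
    ts = torch.linspace(0.0, t_far, n_samples)            # sample depths along the ray
    pts = origin + ts[:, None] * direction                # (n_samples, 3) sample points

    # Per-Gaussian response at each sample point (Mahalanobis distance to the mean).
    diff = pts[:, None, :] - means[None, :, :]             # (n_samples, n_gauss, 3)
    mahal = torch.einsum('sgi,gij,sgj->sg', diff, inv_cov, diff)
    weights = torch.exp(-0.5 * mahal) * densities[None, :]  # (n_samples, n_gauss)

    # Total density and density-weighted color at each sample.
    sigma = weights.sum(dim=1)
    rgb = (weights[..., None] * colors[None]).sum(dim=1) / (sigma[:, None] + 1e-8)

    # Front-to-back alpha compositing (emission-absorption model).
    delta = t_far / n_samples
    alpha = 1.0 - torch.exp(-sigma * delta)
    trans = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha[:-1]]), dim=0)
    return (trans[:, None] * alpha[:, None] * rgb).sum(dim=0)  # final pixel color
```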
ScalingConcept can enhance or suppress existing concepts in images and audio without adding new elements. It can generate poses, enhance object stitching, and reduce fuzziness in anime production.
Stable-Hair can robustly transfer a diverse range of real-world hairstyles onto user-provided faces for virtual hair try-on. It employs a two-stage pipeline that includes a Bald Converter for hair removal and specialized modules for high-fidelity hairstyle transfer.
StoryMaker can generate a series of images with consistent characters throughout. It keeps the same facial features, clothing, hairstyles, and body types across images, allowing for cohesive storytelling.
MagicMan can generate high-quality multi-view images and normal maps of humans from a single photo, suitable for 3D human reconstruction.
Text2Place can place any human or object realistically into diverse backgrounds. This enables scene hallucination (generating compatible scenes for a given human pose), text-based editing of the subject, and placing multiple people into a scene.
Diffusion2GAN is a method to distill a complex multistep diffusion model into a single-step conditional GAN student, dramatically accelerating inference while preserving image quality. This enables one-step 512px/1024px image generation at interactive speeds of 0.09/0.16 seconds, as well as 4K image upscaling!
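The distillation recipe boils down to regressing the one-step student onto the multistep teacher's outputs with a perceptual loss while an adversarial term keeps single-step outputs sharp. The sketch below uses placeholder modules (teacher, student, discriminator, perceptual are assumed names); the paper itself works from precomputed noise-image pairs and a latent-space perceptual loss (E-LatentLPIPS), which this sketch approximates with on-the-fly teacher sampling.

```python
import torch

def distillation_step(teacher, student, discriminator, perceptual, prompts, opt):
    """One training step: the student maps (noise, prompt) to an image in a
    single forward pass and is regressed onto the teacher's multistep output."""
    z = torch.randn(len(prompts), 4, 64, 64)           # initial latent noise
    with torch.no_grad():
        target = teacher.sample(z, prompts, steps=50)   # expensive multistep target
    fake = student(z, prompts)                          # cheap one-step prediction

    # Perceptual regression toward the teacher plus an adversarial term
    # (generator side of a hinge/non-saturating GAN loss).
    loss = perceptual(fake, target) - discriminator(fake, prompts).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```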
tps-inbetween can generate high-quality intermediate frames for animation line art. It effectively connects lines and fills in missing details, even during fast movements, using a method that models keypoint relationships between frames.
Filtered Guided Diffusion shows that image-to-image translation and editing don’t necessarily require additional training. FGD simply applies a filter to the input of each diffusion step, adapted based on the output of the previous step, which makes the approach easy to implement.
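To make that per-step filtering concrete, here is a minimal sketch assuming a generic denoiser and scheduler interface (model, predict_x0, and step_from_x0 are placeholder names, not the actual FGD code) and a Gaussian low-pass filter as the guiding filter.

```python
import torch
import torch.nn.functional as F

def gaussian_blur(x, sigma):
    # Separable Gaussian low-pass filter used as the structure extractor.
    k = 2 * int(3 * sigma) + 1                       # odd kernel size
    coords = torch.arange(k, dtype=x.dtype, device=x.device) - k // 2
    g = torch.exp(-(coords ** 2) / (2 * sigma ** 2))
    g = (g / g.sum()).view(1, 1, 1, k)
    c = x.shape[1]
    w_h = g.repeat(c, 1, 1, 1)                       # horizontal pass weights
    w_v = g.transpose(2, 3).repeat(c, 1, 1, 1)       # vertical pass weights
    x = F.conv2d(x, w_h, padding=(0, k // 2), groups=c)
    return F.conv2d(x, w_v, padding=(k // 2, 0), groups=c)

@torch.no_grad()
def filtered_guided_sampling(model, scheduler, x_ref, sigma=3.0, weight=1.0):
    """Training-free image-to-image translation: at every denoising step, the
    coarse structure of the current prediction is nudged toward the coarse
    structure of the reference image x_ref."""
    x_t = torch.randn_like(x_ref)
    for t in scheduler.timesteps:
        eps = model(x_t, t)                          # predicted noise
        x0 = scheduler.predict_x0(x_t, eps, t)       # predicted clean image
        # Swap in the reference's low frequencies, keep the prediction's detail.
        x0 = x0 + weight * (gaussian_blur(x_ref, sigma) - gaussian_blur(x0, sigma))
        x_t = scheduler.step_from_x0(x_t, x0, t)     # form the next, less noisy sample
    return x0
```

The key design point is that nothing is trained: the filter only re-injects the reference's coarse structure into each intermediate prediction.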
DreamMover can generate high-quality intermediate images and short videos from image pairs with large motion. It uses a flow estimator based on diffusion models to keep details and ensure consistency between frames and input images.
HairFastGAN can transfer hairstyles from one image to another in near real-time. It handles different poses and colors well, achieving high quality in under a second on an Nvidia V100.
CharacterFactory can generate an endless variety of characters that stay consistent in appearance across different images and videos. It uses GANs and word embeddings from celebrity names to keep each character's identity consistent, making it easy to integrate with other models.
MOWA is a multiple-in-one image warping model that can be used for various tasks such as rectangling panoramic images, unrolling shutter images, correcting rotated images, rectifying fisheye images, and image retargeting.
ControlNet++ can improve image generation by ensuring that generated images match the given controls, like segmentation masks and depth maps. It shows better performance than its predecessor, ControlNet, with improvements of 7.9% in mIoU, 13.4% in SSIM, and 7.6% in RMSE.
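Roughly, the consistency objective re-extracts the condition from a single-step denoised image using a frozen discriminative model and penalizes disagreement with the input condition. The sketch below uses placeholder modules (controlnet_pipeline, seg_model, and the helper methods are assumed names, not the official implementation), with segmentation masks as the example control.

```python
import torch
import torch.nn.functional as F

def cycle_consistency_loss(controlnet_pipeline, seg_model, images, seg_masks, t):
    """Consistency signal: generate (approximately) under the control, re-extract
    the control from the result, and compare it with the input control.
    seg_masks are assumed one-hot, shape (B, C, H, W)."""
    # Noise the image at timestep t, predict the noise with the ControlNet-
    # conditioned UNet, and take the cheap one-step estimate of the clean image.
    noise = torch.randn_like(images)
    noisy = controlnet_pipeline.add_noise(images, noise, t)
    pred_noise = controlnet_pipeline.unet(noisy, t, control=seg_masks)
    x0_pred = controlnet_pipeline.predict_x0(noisy, pred_noise, t)

    # Re-extract the condition with a frozen segmentation network and penalize
    # pixel-level disagreement with the input masks.
    logits = seg_model(x0_pred)
    return F.cross_entropy(logits, seg_masks.argmax(dim=1))
```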
ID2Reflectance can generate high-quality facial reflectance maps from a single image.
Desigen can generate high-quality design templates, including background images and layout elements. It uses advanced diffusion models for better control and has been tested on over 40,000 advertisement banners, achieving results similar to human designers.
Intrinsic Image Diffusion can generate detailed albedo, roughness, and metallic maps from a single indoor scene image.