Image AI Tools
Free image AI tools for generating and editing visuals and creating 3D assets for games, films, and other creative projects.
Entity-Level Text-Guided Image Manipulation can edit specific parts of an image based on text descriptions while keeping other areas unchanged. It uses a two-stage process, first aligning text semantics with image regions and then applying the edit, allowing for flexible and precise manipulation.
MultiDiffusion can generate high-quality images using a pre-trained text-to-image diffusion model. It allows users to control aspects like image size and includes features for guiding images with segmentation masks and bounding boxes.
ControlNet can add control to text-to-image diffusion models. It lets users manipulate image generation using methods like edge detection and depth maps, while working well with both small and large datasets.
Neural Congealing can align similar content across multiple images using a self-supervised method. It uses pre-trained DINO-ViT features to create a shared semantic map, allowing for effective alignment even with different appearances and backgrounds.
Pix2Pix-Zero can perform zero-shot image edits, like turning a cat into a dog, without needing manually written text prompts or per-image training. It preserves the original image's structure and builds on pre-trained text-to-image diffusion models for better editing results.
StyleGAN-T can generate high-quality images at 512x512 resolution in just 2 seconds using a single NVIDIA A100 GPU. It solves problems in text-to-image synthesis, like stable training on diverse datasets and strong text alignment.
CLIPascene can convert scene images into sketches with different levels of detail and simplicity. Users can create a range of sketches, from detailed to simple, allowing for personalized artistic expression.
VectorFusion can generate SVG-exportable vector graphics from text prompts. It uses a text-conditioned diffusion model to create high-quality outputs in various styles, like pixel art and sketches, without needing large datasets of captioned SVGs.
InstructPix2Pix can edit images based on written instructions. It allows users to add or remove objects, change colors, and transform styles quickly, using a conditional diffusion model trained on a large dataset.
MinD-Vis can create realistic images from brain recordings using a method that combines Sparse Masked Brain Modeling and a Double-Conditioned Latent Diffusion Model. It achieves top performance in understanding thoughts and generating images, surpassing previous results by 66% in semantic mapping and 41% in image quality, while needing very few paired examples.
UnZipLoRA can break down an image into its subject and style. This makes it possible to create variations and apply styles to new subjects.
SDEdit can generate and edit photo-realistic images using user-guided inputs like hand-drawn strokes or text prompts. It outperforms GAN-based methods, achieving high scores in realism and overall satisfaction without needing specific training.
Bridging High-Quality Audio and Video via Language for Sound Effects Retrieval from Visual Queries can retrieve high-quality sound effects from a single video frame without needing text metadata. It uses a combination of large language models and contrastive learning to match sound effects to video better than existing methods.
GFPGAN can restore realistic facial details from low-quality images using a pretrained face GAN. It works well on both synthetic and real-world images, allowing for quick restoration with just one pass, unlike older methods.