Image Segmentation
Free image segmentation AI tools for accurately identifying and isolating objects in images, streamlining your creative projects and visual content creation.
ZIM can generate precise matte masks from segmentation labels, enabling zero-shot image matting.
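For context, a segmentation label usually has to be widened into a trimap before a matting network can predict soft alpha around hair and fine edges, which is exactly the hand-tuned step that ZIM-style zero-shot matting aims to skip. Here is a minimal sketch of that classic mask-to-trimap preparation, assuming OpenCV and a 0/255 binary mask; the function name and band width are illustrative, and this is not ZIM's actual API:

```python
# Generic mask-to-trimap preparation (illustrative, not ZIM's API).
import cv2
import numpy as np

def mask_to_trimap(mask: np.ndarray, band: int = 10) -> np.ndarray:
    """mask: uint8 image with 0 = background, 255 = foreground."""
    kernel = np.ones((band, band), np.uint8)
    sure_fg = cv2.erode(mask, kernel)             # confident foreground core
    unknown = cv2.dilate(mask, kernel) - sure_fg  # uncertain band along the boundary
    trimap = np.zeros_like(mask)
    trimap[unknown > 0] = 128                     # unknown region to be matted
    trimap[sure_fg > 0] = 255                     # known foreground
    return trimap

# Example: trimap = mask_to_trimap(cv2.imread("person_mask.png", cv2.IMREAD_GRAYSCALE))
# A trimap-based matting model would then fill in per-pixel alpha inside the 128 band.
```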
ControlAR adds spatial controls like edges, depth maps, and segmentation masks to autoregressive image models like LlamaGen.
SEMat can improve interactive image matting! It rethinks both the network design and the training objectives to achieve better transparency, detail, and accuracy than methods like MAM and SmartMat.
Text2Place can place any human or object realistically into diverse backgrounds. This enables scene hallucination (generating a compatible scene for a given human pose), text-based editing of the subject, and placing multiple people into one scene.
Sprite-Decompose can break down animated graphics into sprites using videos and bounding box annotations.
Adobe’s Magic Fixup lets you edit images with a cut-and-paste approach and then fixes up the edit automatically. I can see this being super useful for generating animation frames for tools like AnimateDiff, but it’s not clear yet if or when this will land in Photoshop.
PartGLEE can locate and identify objects and their parts in images. The method uses a unified framework that enables detection, segmentation, and grounding at any granularity.
MIGC++ is a plug-and-play controller that gives Stable Diffusion precise position control while keeping attributes like color, shape, material, texture, and style correct. It can also control the number of instances and improve the interaction between instances.
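To make the "position control plus attributes" idea concrete, here is a hypothetical sketch of what an instance-level layout could look like; the layout format, the migc_pipeline name, and its arguments are all made up for illustration and are not the actual MIGC++ interface:

```python
# Hypothetical instance layout for position- and attribute-controlled generation.
# Everything below is illustrative; consult the MIGC++ repo for the real interface.
layout = [
    {"prompt": "a red ceramic mug",    "box": (0.10, 0.55, 0.35, 0.90)},  # (x0, y0, x1, y1), normalized
    {"prompt": "a green glass bottle", "box": (0.55, 0.30, 0.80, 0.90)},
]

# image = migc_pipeline(                      # hypothetical call
#     prompt="a product photo on a wooden table",
#     instances=layout,                       # per-instance position + attribute text
#     num_inference_steps=30,
# ).images[0]
```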
MaGGIe can efficiently predict high-quality human instance mattes from coarse binary masks for both image and video input. The method outputs all instance mattes simultaneously while keeping memory and latency in check, making it suitable for real-time applications.
IntrinsicAnything can recover object materials from arbitrary images and enables single-view image relighting.
pix2gestalt is able to estimate the shape and appearance of whole objects that are only partially visible behind occlusions.
Break-A-Scene can extract multiple concepts from a single image using segmentation masks. It allows users to re-synthesize individual concepts or combinations in different contexts, enhancing scene generation with a two-phase customization process.
PAIR Diffusion is a generic framework that enables a diffusion model to control the structure and appearance of each object in an image. This allows various object-level editing operations on real images, such as reference-based appearance editing, free-form shape editing, adding objects, and object variations.
MultiDiffusion can generate high-quality images with a pre-trained text-to-image diffusion model without any extra training. It lets users control aspects like image size and aspect ratio and guide generation with segmentation masks and bounding boxes.
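The size/aspect-ratio use case is the easiest one to try, because Hugging Face diffusers ships a MultiDiffusion-based panorama pipeline. A minimal sketch, assuming diffusers, a CUDA GPU, and the Stable Diffusion 2 base checkpoint; the mask/box guidance features are in the authors' own code rather than this particular pipeline:

```python
import torch
from diffusers import StableDiffusionPanoramaPipeline, DDIMScheduler

# MultiDiffusion panorama generation with a stock Stable Diffusion 2 checkpoint.
model_ckpt = "stabilityai/stable-diffusion-2-base"
scheduler = DDIMScheduler.from_pretrained(model_ckpt, subfolder="scheduler")
pipe = StableDiffusionPanoramaPipeline.from_pretrained(
    model_ckpt, scheduler=scheduler, torch_dtype=torch.float16
).to("cuda")

# The width far exceeds what the base model was trained on; MultiDiffusion fuses
# overlapping diffusion windows so the result stays coherent.
image = pipe("a photo of the dolomites", height=512, width=2048).images[0]
image.save("panorama.png")
```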
Neural Congealing can align similar content across multiple images using a self-supervised method. It uses pre-trained DINO-ViT features to create a shared semantic map, allowing for effective alignment even with different appearances and backgrounds.
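If you want to poke at the ingredient doing most of the work, the pre-trained DINO-ViT features are easy to extract on their own. A rough sketch, assuming torch/torchvision and the official DINO torch.hub entry point; which layers Neural Congealing actually uses and how it aggregates them into the shared map is simplified away here:

```python
import torch
from PIL import Image
from torchvision import transforms

# Load a pre-trained DINO ViT-S/16 from the official hub entry point.
dino = torch.hub.load("facebookresearch/dino:main", "dino_vits16")
dino.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])

img = preprocess(Image.open("input.jpg").convert("RGB")).unsqueeze(0)  # placeholder path
with torch.no_grad():
    tokens = dino.get_intermediate_layers(img, n=1)[0]  # (1, 1 + 14*14, 384): CLS + patch tokens

patch_features = tokens[:, 1:, :]  # per-patch descriptors, the raw semantic signal
print(patch_features.shape)
```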
The method from "Bridging High-Quality Audio and Video via Language for Sound Effects Retrieval from Visual Queries" can retrieve high-quality sound effects from a single video frame without needing text metadata. It combines large language models and contrastive learning to match sound effects to video better than existing methods.