Video Object Detection

Free video object detection AI tools for identifying and tracking 3D objects in videos, enhancing content creation for films and games.

ReferDINO

ReferDINO can segment objects in videos from text descriptions. It uses a dedicated mask decoder to improve accuracy and models object movement over time more effectively.

27.06.25 · Project Page · Code · Video Object Detection

MiniMax-Remover

MiniMax-Remover can remove objects from videos efficiently with just 6 sampling steps.

30.05.25 · Project Page · Code · Video Object Detection · Video Editing

SegAnyMo

SegAnyMo can segment moving objects in videos without needing human labels.

31.03.25 · Project Page · Code · Video Object Detection

MatAnyone

MatAnyone can generate stable and high-quality human video matting masks.

19.02.25 · Project Page · Code · Video Object Detection

VD3D

VD3D enables camera control for video diffusion models and can transfer the camera trajectory from a reference video.

11.02.25 · Project Page · Code · Video Editing · Video Object Detection

Diffusion as Shader

Diffusion as Shader can generate high-quality videos from 3D tracking inputs.

11.02.25 · Project Page · Code · Text-to-Video · Video Editing · Video Object Detection · Controllable Video Generation

Splatter a Video

Splatter a Video can turn a video into a 3D Gaussian representation, allowing for enhanced video tracking, depth prediction, motion and appearance editing, and stereoscopic video generation.

15.01.25 · Project Page · Code · Video Object Detection · Video Editing · Video Depth Estimation

Context-Aware Video Instance Segmentation

CAVIS performs instance segmentation on videos. It tracks objects more reliably and matches instances across frames more accurately, which yields more stable segmentation results.

05.12.24 · Project Page · Code · Video Object Detection · Video Object Tracking

World-Grounded Human Motion Recovery via Gravity-View Coordinates

GVHMR can recover human motion from monocular videos by estimating poses in a Gravity-View coordinate system aligned with gravity and the camera.

11.09.24 · Project Page · Code · Video Object Detection · Video Analysis

VimTS

VimTS can extract text from both images and videos, and is designed to work consistently across different types of media.

30.04.24 · Project Page · Code · Video Object Detection · Image-to-Video

Moving Object Segmentation

FlowSAM can discover and segment moving objects in videos by combining the Segment Anything Model (SAM) with optical flow. It outperforms previous methods, achieving better object identity and sequence-level segmentation for both single and multi-object scenarios.

18.04.24 · Project Page · Code · Video Object Detection

Video-Based Human Pose Regression via Decoupled Space-Time Aggregation

DSTA is a video-based human pose estimation method that maps the input directly to output joint coordinates.

29.03.24 · Code · Video Object Detection · Video Analysis

Total-Recon

Total-Recon can render scenes from monocular RGBD videos from different camera angles, like first-person and third-person views. It creates realistic 3D videos of moving objects and allows for 3D filters that add virtual items to people in the scene.

24.04.23 · Project Page · Code · Video Scene Detection · Video Object Detection · Video Analysis