AI Toolbox

Video AI Tools

Free video AI tools for editing, generating animations, and analyzing footage, perfect for filmmakers and content creators seeking efficiency.

Video AI Tools

Audio-to-Video Controllable Video Generation Image-to-Video Lip Syncing Personalized Video Generation Sketch-to-Video Talking Head Generation Text-to-Video Video Analysis Video Captioning Video Colorization Video Depth Estimation Video Editing Video Generation Video Inpainting Video Interpolation Video Object Detection Video Object Tracking Video Outpainting Video Outpainting Video Editing Video Personalization Video Prediction Video Reconstruction Video Relighting Video Restoration Video Scene Detection Video Style Transfer Video Summarization Video-to-4D Video-to-Audio Video-to-Video Video-to-Video Translation Video Upscaling Virtual Video Try-On

InterActHuman

InterActHuman can generate videos with multiple human characters by matching audio to each person.

16.10.25 · Project Page · Code · Audio-to-Video

MTV

MTV can create high-quality videos that match audio by separating it into speech, effects, and music tracks.

15.10.25 · Project Page · Code · Audio-to-Video · Lip Syncing

VividFace

VividFace can swap faces in videos while keeping the original person’s look and expressions. It handles challenges like keeping the face consistent over time and working well with different angles and lighting.

15.10.25 · Project Page · Code · Video Editing

Mask²DiT

Mask²DiT can generate long videos with multiple scenes by aligning video segments with text descriptions.

14.10.25 · Project Page · Code · Text-to-Video

DiffVSR

DiffVSR can upscale and restore videos by improving their resolution while keeping details clear and stable across frames.

06.10.25 · Project Page · Code · Video Restoration · Video Upscaling

RealisMotion

RealisMotion can generate human videos with realistic motions by separating four key elements: the subject, background, movement path, and actions. It uses a 3D world coordinate system for better motion editing and employs text-to-video diffusion models for high-quality results.

23.09.25 · Project Page · Code · Text-to-Video

CapStARE

CapStARE can achieve high accuracy in gaze estimation. It works in real-time at about 8ms per frame and handles extreme head poses well, making it ideal for interactive systems.

22.09.25 · Code · Video Analysis

Follow-Your-Click

Follow-Your-Click can animate specific regions of an image with a simple user click and a short motion prompt, and allows to control the speed of the animation.

17.09.25 · Project Page · Code · Image-to-Video

Animate-X++

Animate-X++ can animate characters from a single image and a pose sequence while creating dynamic backgrounds.

17.09.25 · Project Page · Code · Image-to-Video

HuMo

HuMo can generate high-quality human-centric videos from text, images, and audio. It ensures that the subjects are preserved and the audio matches the visuals, using advanced training methods for better control.

17.09.25 · Project Page · Code · Text-to-Video · Audio-to-Video · Image-to-Video

Diffuman4D

Diffuman4D can generate high-quality, 4D-consistent videos of human performances from just a few input videos. It uses a spatio-temporal diffusion model to improve the quality of the videos, making them more realistic and consistent than other methods.

11.09.25 · Project Page · Code · Video Inpainting · Video Restoration · Video Editing

3DV-TON

3DV-TON can generate high-quality videos for trying on clothes using 3D models. It handles complex clothing patterns and different body poses well, and it has a strong masking method to reduce errors.

28.08.25 · Project Page · Code · Virtual Video Try-On

CanonSwap

CanonSwap can transfer identities from images to videos while keeping natural movements like head poses and facial expressions.

25.08.25 · Project Page · Code · Video Editing

Hunyuan-GameCraft

Hunyuan-GameCraft can generate interactive game videos by combining keyboard and mouse inputs into a shared camera view.

21.08.25 · Project Page · Code · Image-to-Video

Vivid-VR

Vivid-VR can restore and enhance videos using a text-to-video diffusion transformer. It achieves realistic textures and smooth motion while preserving content and giving users control over the video generation process.

21.08.25 · Project Page · Code · Model · Video Restoration

Lumen

Lumen can replace video backgrounds while adjusting the lighting of the foreground for a consistent look.

20.08.25 · Project Page · Code · Model · Video Editing · Video Relighting

MyTimeMachine

MyTimeMachine can change faces to look older or younger using a global aging model. It needs just 50 selfies to keep the person’s identity, making it great for visual effects and realistic age transformations.

18.08.25 · Project Page · Code · Video Personalization · Video Editing

FantasyPortrait

FantasyPortrait can generate high-quality animations from static images for both single and multi-character scenes.

12.08.25 · Project Page · Code · Controllable Video Generation · Talking Head Generation

ShoulderShot

ShoulderShot can generate over-the-shoulder dialogue videos that keep characters looking the same and maintain a smooth flow between shots. It allows for longer conversations and offers more flexibility in how shots are arranged.

05.08.25 · Project Page · Text-to-Video

VideoColorGrading

VideoColorGrading can generate a look-up table (LUT) for matching colors between reference scenes and input videos.

04.08.25 · Code · Video Colorization