Video AI Tools
Free video AI tools for editing, generating animations, and analyzing footage, perfect for filmmakers and content creators seeking efficiency.
AniDoc can automate the colorization of line art in videos and create smooth animations from simple sketches.
FCVG can create smooth video transitions between two key frames. It improves stability by defining clear paths for movement and matching lines from the input frames, ensuring coherent changes even with fast motion.
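The idea of guiding an in-between frame along explicit correspondence paths can be sketched with plain linear interpolation of matched keypoints. This is only an illustration of the general concept, not FCVG's actual method or code; the function name and array shapes are assumptions.

```python
import numpy as np

def interpolate_matches(pts_a, pts_b, num_steps):
    """Linearly interpolate matched 2D keypoints between two key frames.

    pts_a, pts_b: (N, 2) arrays of corresponding point coordinates
    in the first and second key frame.
    Returns a list of (N, 2) arrays, one per in-between step,
    tracing a straight path from each point to its match.
    """
    frames = []
    for t in np.linspace(0.0, 1.0, num_steps):
        frames.append((1.0 - t) * pts_a + t * pts_b)
    return frames

# Two matched points moving between the key frames.
start = np.array([[0.0, 0.0], [10.0, 5.0]])
end = np.array([[4.0, 8.0], [14.0, 1.0]])
path = interpolate_matches(start, end, num_steps=5)
```

Real interpolation models condition on such paths rather than drawing pixels along them directly, but the path itself is what keeps fast motion coherent.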
CustomCrafter can generate high-quality videos from text prompts and reference images. It improves motion generation with a Dynamic Weighted Video Sampling Strategy and allows for better concept combinations without needing extra video or fine-tuning.
DisPose can generate high-quality human image animations from sparse skeleton pose guidance.
SynCamMaster can generate videos from different viewpoints while keeping the look and shape consistent. It improves text-to-video models for multi-camera use and allows re-rendering from new angles.
ObjCtrl-2.5D enables object control in image-to-video generation using 3D trajectories from 2D inputs with depth information.
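Lifting a 2D trajectory into 3D with depth can be illustrated with standard pinhole back-projection. This is a generic sketch of the geometric idea, not ObjCtrl-2.5D's code; the intrinsics and function name are assumed for the example.

```python
import numpy as np

def lift_to_3d(traj_2d, depths, fx, fy, cx, cy):
    """Back-project a 2D pixel trajectory with per-point depth into
    3D camera coordinates using a pinhole camera model.

    traj_2d: (N, 2) pixel coordinates (u, v).
    depths:  (N,) depth value at each trajectory point.
    fx, fy:  focal lengths; cx, cy: principal point.
    Returns an (N, 3) array of (X, Y, Z) points.
    """
    u, v = traj_2d[:, 0], traj_2d[:, 1]
    z = np.asarray(depths, dtype=float)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)

# A short trajectory drifting right while receding from the camera.
traj = np.array([[320.0, 240.0], [340.0, 240.0], [360.0, 240.0]])
depth = np.array([1.0, 1.5, 2.0])
points_3d = lift_to_3d(traj, depth, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

The same 2D drag thus becomes different 3D motions depending on the depth profile, which is what makes depth-aware trajectory control more expressive than pure 2D dragging.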
3DTrajMaster can control the 3D motions of multiple objects in videos using user-defined 6DoF pose sequences.
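A 6DoF pose sequence pairs a translation with a 3D rotation at each step. As a minimal sketch (assuming a ZYX Euler-angle convention and homogeneous 4x4 transforms; this is not 3DTrajMaster's API), such a sequence can be built and applied to an object-space point like this:

```python
import numpy as np

def pose_6dof(tx, ty, tz, roll, pitch, yaw):
    """Build a 4x4 homogeneous transform from a 6DoF pose:
    3 translation components plus roll/pitch/yaw Euler angles
    (ZYX convention, i.e. R = Rz @ Ry @ Rx)."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    T = np.eye(4)
    T[:3, :3] = rz @ ry @ rx
    T[:3, 3] = [tx, ty, tz]
    return T

# A pose sequence: translate along x while rotating about z.
sequence = [pose_6dof(t * 0.1, 0, 0, 0, 0, t * np.pi / 8) for t in range(4)]
point = np.array([1.0, 0.0, 0.0, 1.0])  # homogeneous object-space point
moved = [T @ point for T in sequence]
```

Controlling several objects amounts to supplying one such pose sequence per object, with the generator conditioned on all of them jointly.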
Customizing Motion can learn motion patterns from input videos and apply them to new, unseen contexts.
MEMO can generate talking videos from images and audio. It keeps the person’s identity consistent and matches lip movements to the audio, producing natural expressions.
CAVIS can perform instance segmentation on videos, tracking objects more reliably and improving instance matching for more accurate and stable segmentation.
VideoRepair can improve text-to-video generation by finding and fixing small mismatches between text prompts and videos.
Inverse Painting can generate time-lapse videos of the painting process from a target artwork. It uses a diffusion-based renderer to learn from real artists’ techniques, producing realistic results across different artistic styles.
CAT4D can create dynamic 4D scenes from single videos. It uses a multi-view video diffusion model to generate videos from different angles, allowing for strong 4D reconstruction and high-quality images.
SAMURAI combines the state-of-the-art visual video tracking of SAM 2 with motion-aware memory.
StableV2V can stabilize shape consistency in video-to-video editing by breaking down the editing process into steps that match user prompts. It handles text-based, image-based, and video inpainting.
CamI2V can generate videos from images with precise control over camera movements and text prompts.
JoyVASA can generate high-quality lip-sync videos of human and animal faces from a single image and speech clip.
CHANGER can integrate an actor’s head onto a target body in digital content. It uses chroma keying for clear backgrounds and enhances blending quality with Head shape and long Hair augmentation (H2 augmentation) and a Foreground Predictive Attention Transformer (FPAT).
DAWN can generate talking head videos from a single portrait and audio clip. It produces lip movements and head poses quickly, making it effective for creating long video sequences.
DimensionX can generate photorealistic 3D and 4D scenes from a single image using controllable video diffusion.