Video AI Tools
Free video AI tools for editing, generating animations, and analyzing footage, perfect for filmmakers and content creators seeking efficiency.
Make-Your-Video can generate customized videos from text and depth information for better control over content. It uses a Latent Diffusion Model to improve video quality and reduce the need for computing power.
Control-A-Video can generate controllable text-to-video content using diffusion models. It allows for fine-tuned customization with edge and depth maps, ensuring high quality and consistency in the videos.
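To make the idea of depth-map conditioning concrete, here is a minimal per-frame sketch using the public Hugging Face diffusers ControlNet pipeline. It is not Control-A-Video's (or Make-Your-Video's) own code; the model IDs and the depth-map path are illustrative assumptions.

```python
import torch
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from diffusers.utils import load_image

# Depth-conditioned generation with a public ControlNet checkpoint (illustration only).
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

depth_map = load_image("frame_000_depth.png")  # hypothetical path: depth map of one source frame
frame = pipe(
    "a castle courtyard at sunset, cinematic lighting",
    image=depth_map,
    num_inference_steps=25,
).images[0]
frame.save("frame_000_generated.png")
```

Run per frame, this gives spatial control but no temporal consistency; video methods like Control-A-Video add cross-frame components on top of this kind of conditioning.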
Make-A-Protagonist can edit videos by changing the protagonist, background, and style using text and images. It allows for detailed control over video content, helping users create unique and personalized videos.
HumanRF can capture high-quality full-body human motion from multiple video angles. It allows playback from new viewpoints at 12 megapixels and uses a 4D dynamic neural scene representation for smooth and realistic motion, making it great for film and gaming.
Sketching the Future can generate high-quality videos from sketched frames using zero-shot text-to-video generation and ControlNet. It smoothly fills in frames between sketches to create consistent video content that matches the user’s intended motion.
Total-Recon can render scenes captured in monocular RGBD video from new camera angles, such as first-person and third-person views. It creates realistic 3D videos of moving objects and allows for 3D filters that add virtual items to people in the scene.
DreamPose can generate animated fashion videos from a single image and a sequence of human body poses. The method is able to capture both human and fabric motion and supports a variety of clothing styles and poses.
Follow Your Pose can generate character videos that match specific poses from text descriptions. It uses a two-stage training process with pre-trained text-to-image models, allowing for continuous pose control and editing.
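As a rough illustration of pose conditioning (not Follow Your Pose's two-stage training), the sketch below extracts an OpenPose skeleton from a reference image with controlnet_aux and uses it to steer a ControlNet pipeline; model IDs and file paths are assumptions.

```python
import torch
from controlnet_aux import OpenposeDetector
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from diffusers.utils import load_image

# Extract a pose skeleton from a reference frame, then condition generation on it.
openpose = OpenposeDetector.from_pretrained("lllyasviel/ControlNet")
pose_image = openpose(load_image("reference_pose.png"))  # hypothetical input image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

image = pipe("an astronaut dancing on the moon", image=pose_image).images[0]
image.save("posed_character.png")
```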
vid2vid-zero can edit videos without needing extra training on video data. It uses image diffusion models for text-to-video alignment and keeps the original video’s look and feel, allowing for effective changes to scenes and subjects.
Text2Video-Zero can generate high-quality videos from text prompts using existing text-to-image diffusion models. It adds motion dynamics and cross-frame attention, making it useful for conditional video generation and instruction-guided video editing.
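Text2Video-Zero also ships as a pipeline in Hugging Face diffusers; assuming a recent diffusers version with TextToVideoZeroPipeline, a minimal usage sketch looks like this (the prompt and output path are placeholders):

```python
import torch
import imageio
from diffusers import TextToVideoZeroPipeline

# Zero-shot text-to-video built on a pretrained text-to-image model.
pipe = TextToVideoZeroPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

result = pipe(prompt="a panda playing guitar in times square").images  # frames as float arrays in [0, 1]
frames = [(frame * 255).astype("uint8") for frame in result]
imageio.mimsave("panda.mp4", frames, fps=4)
```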
Blind Video Deflickering by Neural Filtering with a Flawed Atlas can remove flicker from videos without needing extra guidance. It works well on different types of videos and uses a neural atlas for better consistency, outperforming other methods.
3D Cinemagraphy can turn a single still image into a video by adding motion and depth. It uses 3D space to create realistic animations and fix common issues like artifacts and inconsistent movements.
Video-P2P can edit videos using advanced techniques like word swap and prompt refinement. It adapts image generation models for video, allowing for the creation of new characters while keeping original poses and scenes.
Projected Latent Video Diffusion Models (PVDM) can generate high-resolution, temporally smooth videos by running the diffusion process in a low-dimensional latent space. It achieves a state-of-the-art FVD score of 639.7 on the UCF-101 benchmark, greatly surpassing previous methods.
Dreamix can edit videos based on a text prompt while keeping colors, sizes, and camera angles consistent. It mixes low-resolution spatio-temporal information from the original video with newly synthesized high-resolution content, allowing for advanced editing of motion and appearance.
SceneScape can generate long videos of different scenes from text prompts and camera angles. It ensures 3D consistency by building a unified mesh of the scene, allowing for realistic walkthroughs in places like spaceships and caves.
Shape-aware Text-driven Layered Video Editing can edit the shape of objects in videos while keeping them consistent across frames. It uses a text-conditioned diffusion model to achieve this, making video editing more effective than other methods.
Tune-A-Video can generate videos from a single text-video pair by fine-tuning text-to-image diffusion models. It lets users change subjects, backgrounds, and styles while keeping the video content consistent.
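The mechanism shared by Tune-A-Video (sparse-causal attention) and Text2Video-Zero (cross-frame attention) is to let every frame's queries attend to keys and values from an anchor frame, which keeps appearance consistent across the clip. Below is a minimal, self-contained sketch of the first-frame variant; shapes and names are illustrative, not the papers' actual code.

```python
import torch

def cross_frame_attention(q, k, v):
    """Every frame attends to keys/values from frame 0 (conceptual sketch).

    q, k, v: tensors of shape (frames, heads, tokens, dim).
    """
    first_k = k[:1].expand_as(k)  # reuse the first frame's keys for all frames
    first_v = v[:1].expand_as(v)  # ...and its values
    scores = q @ first_k.transpose(-2, -1) / q.shape[-1] ** 0.5
    return torch.softmax(scores, dim=-1) @ first_v

frames, heads, tokens, dim = 8, 4, 64, 32
q, k, v = (torch.randn(frames, heads, tokens, dim) for _ in range(3))
out = cross_frame_attention(q, k, v)  # (8, 4, 64, 32), appearance anchored to frame 0
```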
MAGVIT can perform video synthesis tasks like inpainting, outpainting, and generating animations from single images. It is much faster than other models, working 100 times quicker than diffusion models and 60 times faster than autoregressive models, while also achieving the best results on multiple benchmarks.
MotionBERT can recover 3D human motion from noisy 2D observations. It excels in 3D pose estimation, action recognition, and motion prediction, achieving the lowest pose estimation error when trained from scratch.
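To give a sense of what 2D-to-3D lifting involves, here is a toy PyTorch module that maps a sequence of noisy 2D keypoints to 3D joint positions. This is a conceptual sketch only: the real MotionBERT uses a dual-stream spatio-temporal transformer (DSTformer), large-scale pretraining, and more careful losses, and every size below is made up.

```python
import torch
import torch.nn as nn

class Lifter2Dto3D(nn.Module):
    """Toy transformer that lifts 2D keypoint sequences to 3D (not MotionBERT itself)."""

    def __init__(self, joints=17, dim=128, layers=4, heads=4):
        super().__init__()
        self.embed = nn.Linear(joints * 2, dim)               # flatten (x, y) per frame
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, layers)   # positional encoding omitted for brevity
        self.head = nn.Linear(dim, joints * 3)                # predict (x, y, z) per joint

    def forward(self, kpts2d):                                # kpts2d: (batch, frames, joints, 2)
        b, t, j, _ = kpts2d.shape
        x = self.encoder(self.embed(kpts2d.reshape(b, t, j * 2)))
        return self.head(x).reshape(b, t, j, 3)

model = Lifter2Dto3D()
poses3d = model(torch.randn(2, 81, 17, 2))                    # -> (2, 81, 17, 3)
```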