Video AI Tools
Free video AI tools for editing, generating animations, and analyzing footage, perfect for filmmakers and content creators seeking efficiency.
VSTAR is a method that enables text-to-video models to generate longer videos with dynamic visual evolution in a single pass, with no fine-tuning required.
TCAN can animate characters in a variety of styles from a pose guidance video.
Time Reversal can generate the in-between frames of two input images. Among other things, this enables looping cinemagraphs as well as videos with camera and subject motion.
MotionMaster can extract camera motions from a single source video or from multiple videos and apply them to new videos. This gives flexible control over camera motion, producing videos with variable-speed zooms, left and right pans, dolly zoom in and out, and more.
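MotionMaster itself disentangles camera motion inside a video diffusion model. Purely as an illustration of the underlying idea, here is a minimal classical sketch (not MotionMaster's actual method) that estimates frame-to-frame camera motion from a source clip with optical flow and re-applies it to other footage; the file paths and parameters are placeholders.

```python
# Illustrative sketch only, not MotionMaster's method: estimate a per-frame similarity
# transform (camera motion) from a source clip and re-apply it to a target clip.
import cv2
import numpy as np

def extract_camera_motion(path):
    """Return a list of 2x3 affine matrices approximating camera motion per frame pair."""
    cap = cv2.VideoCapture(path)
    ok, prev = cap.read()
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    motions = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200, qualityLevel=0.01, minDistance=10)
        nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
        good_old, good_new = pts[status.flatten() == 1], nxt[status.flatten() == 1]
        M, _ = cv2.estimateAffinePartial2D(good_old, good_new)  # rotation + scale + translation
        motions.append(M)
        prev_gray = gray
    cap.release()
    return motions

def apply_camera_motion(path, motions):
    """Warp each frame of a target clip by the accumulated source camera motion."""
    cap = cv2.VideoCapture(path)
    out, acc = [], np.eye(3)
    for M in motions:
        ok, frame = cap.read()
        if not ok:
            break
        acc = np.vstack([M, [0, 0, 1]]) @ acc        # accumulate motion over time
        h, w = frame.shape[:2]
        out.append(cv2.warpAffine(frame, acc[:2], (w, h)))
    cap.release()
    return out
```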
LVCD can colorize lineart videos using a pretrained video diffusion model. It ensures smooth motion and high video quality by effectively transferring colors from reference frames.
PhysGen can generate realistic videos from a single image and user-defined conditions, like forces and torques. It combines physical simulation with video generation, allowing for precise control over dynamics.
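PhysGen's actual pipeline couples a physics engine with image-conditioned video generation. The sketch below only illustrates the simulation half, under assumed names and parameters: integrating a user-supplied force and torque on a 2D rigid body gives the kind of per-frame pose trajectory that a generator could then be conditioned on.

```python
# Illustrative only: integrate a user-defined force and torque on a 2D rigid body
# to get a per-frame pose trajectory that a video generator could be conditioned on.
import numpy as np

def simulate_rigid_body(force, torque, mass=1.0, inertia=0.1, fps=24, seconds=2.0):
    dt = 1.0 / fps
    pos, vel = np.zeros(2), np.zeros(2)        # position (m), linear velocity (m/s)
    angle, omega = 0.0, 0.0                    # orientation (rad), angular velocity (rad/s)
    trajectory = []
    for _ in range(int(seconds * fps)):
        vel += (np.asarray(force) / mass) * dt        # F = m * a
        omega += (torque / inertia) * dt              # tau = I * alpha
        pos = pos + vel * dt
        angle += omega * dt
        trajectory.append((pos.copy(), angle))        # one pose per output video frame
    return trajectory

poses = simulate_rigid_body(force=(0.5, 0.0), torque=0.2)  # push right, spin slightly
```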
PortraitGen can edit portrait videos using multimodal prompts while keeping the video smooth and consistent. It renders over 100 frames per second and supports various styles like text-driven and relighting, ensuring high quality and temporal consistency.
Upscale-A-Video can upscale low-resolution videos using text prompts while keeping the video stable. It allows users to adjust noise levels for better quality and performs well in both test and real-world situations.
DepthCrafter can generate long, high-quality depth map sequences for videos. It uses a three-stage training method with a pre-trained image-to-video diffusion model, achieving top performance in depth estimation for visual effects and video generation.
GVHMR can recover human motion from monocular videos by estimating poses in a Gravity-View coordinate system aligned with gravity and the camera.
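The Gravity-View frame is defined by the gravity direction and the camera's viewing direction. One plausible way to construct such a frame is sketched below; the axis conventions are an assumption, not necessarily GVHMR's exact definition.

```python
# Sketch of building a gravity-aligned, view-oriented coordinate frame.
# Axis conventions here are an assumption, not necessarily GVHMR's exact convention.
import numpy as np

def gravity_view_frame(gravity_cam, view_dir_cam):
    """Rotation whose columns are the Gravity-View axes expressed in camera coordinates."""
    up = -np.asarray(gravity_cam, float)
    up /= np.linalg.norm(up)                         # y-axis: opposite to gravity
    fwd = np.asarray(view_dir_cam, float)
    fwd = fwd - np.dot(fwd, up) * up                 # remove the vertical component of the view direction
    fwd /= np.linalg.norm(fwd)                       # z-axis: camera forward, gravity removed
    right = np.cross(up, fwd)                        # x-axis completes a right-handed frame
    return np.stack([right, up, fwd], axis=1)

# Example: gravity roughly along the camera's +y axis, camera looking slightly downward.
R = gravity_view_frame(gravity_cam=[0.0, 9.81, 0.0], view_dir_cam=[0.0, 0.1, 1.0])
```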
ViewCrafter can generate high-quality 3D views from single or few images using a video diffusion model. It allows for precise camera control and is useful for real-time rendering and turning text into 3D scenes.
HumanVid can generate videos from a character photo while allowing users to control both human and camera motions. It introduces a large-scale dataset that combines high-quality real-world and synthetic data, achieving state-of-the-art performance in camera-controllable human image animation.
Follow-Your-Canvas can outpaint videos to higher resolutions, for example from 512x512 up to 1152x2048.
KEEP can enhance video face super-resolution by maintaining consistency across frames. It uses Kalman filtering to improve facial details, working well on both synthetic and real-world videos.
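KEEP's Kalman module operates on learned facial features inside its network; as a toy analogy only, the snippet below shows a minimal constant-position Kalman filter that smooths a noisy per-frame feature vector over time, with made-up dimensions and noise levels.

```python
# Toy analogy, not KEEP's implementation: a constant-position Kalman filter that
# smooths a noisy per-frame feature vector so details stay consistent across frames.
import numpy as np

def kalman_smooth(features, process_var=1e-3, measurement_var=1e-1):
    """features: (num_frames, dim) noisy per-frame features -> smoothed (num_frames, dim)."""
    x = features[0].astype(float)          # state estimate
    p = np.ones_like(x)                    # per-dimension estimate variance
    smoothed = [x.copy()]
    for z in features[1:]:
        p = p + process_var                # predict: state unchanged, uncertainty grows
        k = p / (p + measurement_var)      # Kalman gain
        x = x + k * (z - x)                # update with the new frame's measurement
        p = (1.0 - k) * p
        smoothed.append(x.copy())
    return np.stack(smoothed)

smoothed = kalman_smooth(np.random.randn(30, 256))  # 30 frames of 256-d features
```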
TVG can create smooth transition videos between two images without needing training. It uses diffusion models and Gaussian Process Regression for high-quality results and adds controls for better timing.
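As a rough illustration of the Gaussian Process Regression idea (not TVG's actual pipeline), the snippet below fits a GP to two endpoint embeddings and queries it at intermediate timestamps to get a smooth latent trajectory; the latent dimensions and kernel settings are assumptions.

```python
# Rough illustration only: use Gaussian Process Regression to interpolate between
# two endpoint embeddings over time, giving a smooth latent path for in-between frames.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

start_latent = np.random.randn(64)            # placeholder for the first image's latent
end_latent = np.random.randn(64)              # placeholder for the second image's latent

times = np.array([[0.0], [1.0]])              # the two endpoints in time
latents = np.stack([start_latent, end_latent])

gpr = GaussianProcessRegressor(kernel=RBF(length_scale=0.5)).fit(times, latents)
query = np.linspace(0.0, 1.0, 16).reshape(-1, 1)
path = gpr.predict(query)                     # (16, 64): one latent per transition frame
```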
Matryoshka Diffusion Models can generate high-quality images and videos using a NestedUNet architecture that denoises inputs at multiple resolutions jointly. The method performs strongly at resolutions up to 1024x1024 pixels and supports effective training without needing specific examples.
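A heavily simplified PyTorch toy of the nested, multi-resolution denoising idea follows; it is an assumption-laden sketch, not the paper's NestedUNet.

```python
# Heavily simplified toy of multi-resolution denoising, not the paper's NestedUNet:
# an inner network denoises a downsampled copy, and an outer network refines full
# resolution using the upsampled low-resolution estimate as guidance.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyNestedDenoiser(nn.Module):
    def __init__(self, channels=3, hidden=32):
        super().__init__()
        self.inner = nn.Sequential(                      # works at 1/4 resolution
            nn.Conv2d(channels, hidden, 3, padding=1), nn.SiLU(),
            nn.Conv2d(hidden, channels, 3, padding=1),
        )
        self.outer = nn.Sequential(                      # works at full resolution
            nn.Conv2d(channels * 2, hidden, 3, padding=1), nn.SiLU(),
            nn.Conv2d(hidden, channels, 3, padding=1),
        )

    def forward(self, noisy):
        low = F.avg_pool2d(noisy, 4)                     # nested low-resolution view
        low_denoised = self.inner(low)
        guide = F.interpolate(low_denoised, size=noisy.shape[-2:],
                              mode="bilinear", align_corners=False)
        return self.outer(torch.cat([noisy, guide], dim=1))

model = ToyNestedDenoiser()
pred = model(torch.randn(1, 3, 256, 256))                # predicted clean image / noise
```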
Puppet-Master can create realistic motion in videos from a single image using simple drag controls. It uses a fine-tuned video diffusion model with an all-to-first attention mechanism to produce high-quality videos.
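As a hedged sketch of what "all-to-first" attention can look like in code (assumed shapes and names, not Puppet-Master's actual layers), every frame's queries attend only to keys and values taken from the first frame, anchoring appearance to the reference image.

```python
# Hedged sketch of "all-to-first" attention: every frame attends to the first frame's
# tokens. Shapes and names are assumptions, not Puppet-Master's implementation.
import torch
import torch.nn.functional as F

def all_to_first_attention(q, k, v):
    """q, k, v: (batch, frames, tokens, dim). Keys/values are taken from frame 0 only."""
    b, f, t, d = q.shape
    k0 = k[:, :1].expand(b, f, t, d)                 # broadcast first-frame keys to all frames
    v0 = v[:, :1].expand(b, f, t, d)                 # broadcast first-frame values to all frames
    return F.scaled_dot_product_attention(q, k0, v0) # each frame's queries attend to frame 0

q = k = v = torch.randn(1, 8, 64, 32)                # 8 frames, 64 tokens, 32-dim
out = all_to_first_attention(q, k, v)                # (1, 8, 64, 32)
```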
Generative Camera Dolly can regenerate a video from any chosen perspective. Still very early, but imagine being able to change any shot or angle in a video after it’s been recorded!
AniTalker is another talking-head generator: it animates a face from a single portrait and input audio, producing naturally flowing movement and diverse results.
Audio-Synchronized Visual Animation can animate static images using audio clips to create synchronized visual animations. It uses the AVSync15 dataset and the AVSyncD diffusion model to produce high-quality animations across different audio types.