Video AI Tools
Free video AI tools for editing, generating animations, and analyzing footage, perfect for filmmakers and content creators seeking efficiency.
vid2vid-zero can edit videos without any extra training on video data. It uses off-the-shelf image diffusion models for text-to-video alignment and preserves the original video's look and feel, enabling effective edits to scenes and subjects.
Text2Video-Zero can generate high-quality videos from text prompts using existing text-to-image diffusion models. It adds motion dynamics and cross-frame attention, making it useful for conditional video generation and instruction-guided video editing.
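The cross-frame attention trick is simple to sketch: instead of each frame attending to its own keys and values, every frame attends to the keys and values of the first frame, which anchors appearance across the clip. Below is a minimal, model-free illustration of that idea in plain Python (toy token vectors, not the tool's real implementation; all names and sizes here are made up for the example):

```python
import math
import random

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def matmul(a, b):  # a: (n, d) times b: (d, m)
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def cross_frame_attention(qs, k0, v0):
    """Every frame's queries attend to the FIRST frame's keys/values,
    so appearance stays anchored to one reference frame. This is a
    conceptual sketch of the cross-frame attention idea, not the
    actual Text2Video-Zero code."""
    d = len(k0[0])
    out_frames = []
    for q in qs:  # q: (tokens, d) queries for one frame
        scores = matmul(q, transpose(k0))  # (tokens, tokens)
        attn = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
        out_frames.append(matmul(attn, v0))
    return out_frames

# Toy data: 3 frames, 2 tokens per frame, dimension 4.
random.seed(0)
rand = lambda r, c: [[random.gauss(0, 1) for _ in range(c)] for _ in range(r)]
qs = [rand(2, 4) for _ in range(3)]
k0, v0 = rand(2, 4), rand(2, 4)  # keys/values of the reference frame
out = cross_frame_attention(qs, k0, v0)
print(len(out), len(out[0]), len(out[0][0]))  # 3 2 4
```

Because all frames read their values from the same reference frame, the subject's appearance cannot drift from frame to frame; the real model adds motion dynamics in latent space on top of this.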
Blind Video Deflickering by Neural Filtering with a Flawed Atlas can remove flicker from videos without needing extra guidance. It works well on different types of videos and uses a neural atlas for better consistency, outperforming other methods.
3D Cinemagraphy can turn a single still image into a video by adding motion and depth. It uses 3D space to create realistic animations and fix common issues like artifacts and inconsistent movements.
Video-P2P can edit videos using advanced techniques like word swap and prompt refinement. It adapts image generation models for video, allowing for the creation of new characters while keeping original poses and scenes.
Projected Latent Video Diffusion Models (PVDM) can generate high-resolution, smooth videos by running diffusion in a low-dimensional latent space. It achieves an FVD score of 639.7 on the UCF-101 benchmark (lower is better), greatly surpassing previous methods.
Dreamix can edit videos based on a text prompt while keeping colors, sizes, and camera angles consistent. It combines low-resolution video data with high-quality content, allowing for advanced editing of motion and appearance.
SceneScape can generate long videos of different scenes from text prompts and camera angles. It ensures 3D consistency by building a unified mesh of the scene, allowing for realistic walkthroughs in places like spaceships and caves.
Shape-aware Text-driven Layered Video Editing can edit the shape of objects in videos while keeping them consistent across frames. It uses a text-conditioned diffusion model to do this, producing shape edits that are more consistent than those of prior methods.
Tune-A-Video can generate videos from a single text-video pair by fine-tuning text-to-image diffusion models. It lets users change subjects, backgrounds, and styles while keeping the video content consistent.
MAGVIT can perform video synthesis tasks like inpainting, outpainting, and generating animations from single images. It is much faster than other models, running 100 times faster than diffusion models and 60 times faster than autoregressive models, while also achieving the best results on multiple benchmarks.
MotionBERT can recover 3D human motion from noisy 2D observations. It excels in 3D pose estimation, action recognition, and motion prediction, achieving the lowest pose estimation error when trained from scratch.
VToonify can create high-quality artistic portrait videos from images. It allows for controllable style transfer on non-aligned faces and produces smooth, coherent videos with flexible controls on color and intensity.
MCVD can generate videos and predict future and past frames using a masked conditional score-based diffusion model. It achieves high quality and diversity in generated frames, excelling in various video synthesis tasks.
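The masked-conditioning idea behind MCVD is easy to illustrate: during training, the past and future frame blocks are each randomly hidden from the model, so a single network learns unconditional generation, forward prediction, backward prediction, and interpolation. Here is a small sketch of that masking scheme in plain Python (the masking probability and function names are assumptions for illustration, not the paper's exact settings):

```python
import random

def sample_conditioning_mask(num_past, num_future, p_mask=0.5):
    """MCVD-style conditioning sketch: independently drop (mask) the
    past and future frame blocks during training so one model covers
    four tasks. p_mask=0.5 is an assumed value, not the paper's exact
    setting; 1 = frame visible to the model, 0 = masked out."""
    mask_past = random.random() < p_mask
    mask_future = random.random() < p_mask
    past = [0 if mask_past else 1] * num_past
    future = [0 if mask_future else 1] * num_future
    if mask_past and mask_future:
        task = "unconditional generation"
    elif mask_future:
        task = "future prediction"
    elif mask_past:
        task = "past prediction"
    else:
        task = "interpolation"
    return past, future, task

random.seed(1)
for _ in range(4):
    print(sample_conditioning_mask(num_past=2, num_future=2))
```

At sampling time, the same trained network is simply given whichever frames are available (past only, future only, both, or neither), and the score-based diffusion fills in the rest.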