Video Editing
Free video editing AI tools for quickly producing polished videos, enhancing effects, and automating edits for content creation and marketing.
CustomCrafter can generate high-quality videos from text prompts and reference images. It improves motion generation with a Dynamic Weighted Video Sampling Strategy and enables better concept combinations without requiring additional reference videos or fine-tuning.
CAT4D can create dynamic 4D scenes from a single monocular video. It uses a multi-view video diffusion model to generate videos from novel viewpoints, enabling robust 4D reconstruction and high-quality images.
StableV2V can stabilize shape consistency in video-to-video editing by breaking the editing process into steps that align with user prompts. It supports text-based editing, image-based editing, and video inpainting.
AutoVFX can automatically create realistic visual effects in a video from natural language editing instructions.
PortraitGen can edit portrait videos using multimodal prompts while keeping the results smooth and temporally consistent. It renders at over 100 frames per second and supports multiple editing modes, including text-driven editing and relighting.
Noise Calibration can improve video quality while keeping the original content structure. It uses a noise optimization strategy with pre-trained diffusion models to enhance visuals and ensure consistency between original and enhanced videos.
MimicMotion can generate high-quality videos of arbitrary length mimicking specific motion guidance. The method is able to produce videos of up to 10,000 frames with acceptable resource consumption.
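The blurb does not say how such long videos are produced; one common recipe in segment-wise pipelines (roughly the idea behind progressive latent fusion) is to generate short overlapping segments and cross-fade their latents in the overlap. A minimal sketch, assuming the per-segment latents have already been generated:

```python
import torch

def fuse_overlapping_segments(segments, overlap):
    """Stitch short latent video segments (each of shape (frames, c, h, w))
    into one long sequence by linearly cross-fading the overlapping frames."""
    video = segments[0]
    for seg in segments[1:]:
        # Ramp from 0 to 1 over the overlap: the previous segment fades out
        # while the new one fades in, avoiding a visible seam.
        w = torch.linspace(0, 1, overlap).view(-1, 1, 1, 1).to(seg)
        blended = (1 - w) * video[-overlap:] + w * seg[:overlap]
        video = torch.cat([video[:-overlap], blended, seg[overlap:]], dim=0)
    return video
```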
MotionBooth can generate videos of customized subjects from a few images and a text prompt with precise control over both object and camera movements.
MVOC is a training-free method for compositing multiple video objects with diffusion models, merging several video objects into a single video while preserving each object's motion and identity.
Mora can enable generalist video generation through a multi-agent framework. It supports text-to-video generation, video editing, and digital world simulation, achieving performance similar to the Sora model.
ReVideo can change video content in specific areas while keeping the motion intact. It allows users to customize motion paths and uses a three-stage training method for precise video editing.
Slicedit can edit videos with a simple text prompt that retains the structure and motion of the original video while adhering to the target text.
StoryDiffusion can generate long sequences of images and videos that maintain consistent content across the generated frames. The method can convert a text-based story into a video with smooth transitions and consistent subjects.
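The entry above does not explain how that consistency is achieved; a common mechanism in this family of models is to let every frame attend to the keys and values of all frames in self-attention, so subjects stay tied together. A minimal sketch of that idea (not StoryDiffusion's exact Consistent Self-Attention):

```python
import torch

def shared_self_attention(q, k, v):
    """Cross-frame shared attention: every frame attends to the pooled
    keys/values of all frames, which ties subject appearance together.

    q, k, v: (frames, tokens, dim)
    """
    f, n, d = q.shape
    k_all = k.reshape(1, f * n, d)   # pool keys from all frames
    v_all = v.reshape(1, f * n, d)   # pool values from all frames
    attn = torch.softmax(q @ k_all.transpose(1, 2) / d ** 0.5, dim=-1)
    return attn @ v_all              # (frames, tokens, dim)
```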
Ctrl-Adapter is a framework for adding diverse controls to any image or video diffusion model, enabling capabilities such as video control with sparse frames, multi-condition control, and video editing.
Motion Inversion can be used to customize the motion of videos by matching the motion of a different video.
AnyV2V can edit videos using prompt-based editing and style transfer without fine-tuning. It first edits the first frame of the video and then generates the edited video from that frame with an image-to-video model, keeping high visual quality.
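As a rough illustration of the first-frame-edit-then-propagate idea, the sketch below edits the first frame with an instruction-based image editor and regenerates the clip with an off-the-shelf image-to-video model in diffusers. The model choices are assumptions, and unlike AnyV2V this simplification does not reuse features from the source video, so the original motion is not preserved:

```python
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline, StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

# 1) Edit the first frame with an instruction-based image editor.
editor = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix", torch_dtype=torch.float16
).to("cuda")
first_frame = load_image("first_frame.png")  # hypothetical input frame
edited = editor("turn it into a watercolor painting", image=first_frame).images[0]

# 2) Regenerate the clip from the edited frame with an image-to-video model.
i2v = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt", torch_dtype=torch.float16
).to("cuda")
frames = i2v(edited.resize((1024, 576)), decode_chunk_size=8).frames[0]
export_to_video(frames, "edited_clip.mp4", fps=7)
```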
FRESCO combines ControlNet with Ebsynth for zero-shot video translation that focuses on preserving the spatial and temporal consistency of the input frames.
ConsistI2V is an image-to-video method with enhanced visual consistency. Compared to other methods, it better preserves the subject, background, and style of the first frame and ensures a fluid, logical progression, while also supporting long video generation and camera motion control.
MoonShot is a video generation model that can condition on both image and text inputs. The model can also integrate pre-trained image ControlNet modules for geometric visual conditions, making it possible to generate videos with specific visual appearances and structures.
VidToMe can edit videos with a text prompt, custom models, and ControlNet guidance while achieving strong temporal consistency. The key idea is to merge similar tokens across multiple frames in the self-attention modules so that the generated frames stay coherent over time.
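A minimal, simplified sketch of cross-frame token merging (in the spirit of ToMe, not VidToMe's actual implementation): tokens from later frames that closely match tokens of an anchor frame are averaged into them, so self-attention operates on a shared, smaller token set.

```python
import torch
import torch.nn.functional as F

def merge_tokens_across_frames(tokens, merge_ratio=0.5):
    """Merge tokens from later frames into similar tokens of an anchor frame.

    tokens: (frames, tokens_per_frame, dim) self-attention tokens of one chunk.
    Returns the averaged anchor tokens plus the tokens too dissimilar to merge.
    """
    f, n, d = tokens.shape
    anchor = tokens[0]                      # tokens of the anchor frame
    others = tokens[1:].reshape(-1, d)      # candidate tokens from other frames

    # Cosine similarity of every candidate token to every anchor token.
    sim = F.normalize(others, dim=-1) @ F.normalize(anchor, dim=-1).T
    score, match = sim.max(dim=-1)          # best-matching anchor token per candidate

    # Merge only the most similar candidates; keep the rest unmerged.
    num_merge = int(others.shape[0] * merge_ratio)
    merge_idx = score.topk(num_merge).indices
    keep_mask = torch.ones(others.shape[0], dtype=torch.bool, device=tokens.device)
    keep_mask[merge_idx] = False

    # Average each merged candidate into its matched anchor token.
    merged = anchor.clone()
    counts = torch.ones(n, 1, device=tokens.device, dtype=tokens.dtype)
    merged.index_add_(0, match[merge_idx], others[merge_idx])
    counts.index_add_(0, match[merge_idx],
                      torch.ones(num_merge, 1, device=tokens.device, dtype=tokens.dtype))
    merged = merged / counts

    return torch.cat([merged, others[keep_mask]], dim=0)
```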