Video Editing
Free video editing AI tools for quickly producing polished videos, enhancing effects, and automating edits for content creation and marketing.
SEINE is a short-to-long video diffusion model that focuses on generative transitions and predictions. The goal is to generate high-quality long videos with smooth, creative transitions between scenes and clips of varying lengths. The model can also be used for image-to-video animation and autoregressive video prediction.
MotionDirector is a method that can train text-to-video diffusion models to generate videos with the desired motions from a reference video.
FLATTEN can improve the visual consistency of edited videos by using optical flow in diffusion models. This method keeps video frames coherent without needing extra training.
Ground-A-Video can edit multiple attributes of a video using pre-trained text-to-image models without any training. It maintains consistency across frames and accurately preserves non-target areas, making it more effective than other editing methods.
Hierarchical Masked 3D Diffusion Model for Video Outpainting can fill in missing parts at the edges of video frames while keeping the motion smooth. It uses a coarse-to-fine inference pipeline that reduces accumulated artifacts and improves results by using multiple guide frames.
StableVideo is yet another vid2vid method, but it goes beyond style transfer: it can differentiate between foreground and background when editing a video, making it possible to reimagine the subject within an entirely different landscape.
TokenFlow is a new video-to-video method for temporally coherent video editing with text. We’ve seen a lot of them, but this one looks extremely good, with almost no flickering, and requires no fine-tuning whatsoever.
We’ve already seen a few attempts at bringing ControlNet to video, but getting temporal coherency right seems to be a tricky issue to solve. ControlVideo is the next attempt, and things are starting to look extremely promising.
VideoComposer can generate videos with control over how they look and move using text, sketches, and motion vectors. It improves video quality by ensuring frames match well, allowing for flexible video creation and editing.
Control-A-Video can generate controllable text-to-video content using diffusion models. It allows for fine-tuned customization with edge and depth maps, ensuring high quality and consistency in the videos.
Make-A-Protagonist can edit videos by changing the protagonist, background, and style using text and images. It allows for detailed control over video content, helping users create unique and personalized videos.
Follow Your Pose can generate character videos that match specific poses from text descriptions. It uses a two-stage training process with pre-trained text-to-image models, allowing for continuous pose control and editing.
vid2vid-zero can edit videos without needing extra training on video data. It uses image diffusion models for text-to-video alignment and keeps the original video’s look and feel, allowing for effective changes to scenes and subjects.
Blind Video Deflickering by Neural Filtering with a Flawed Atlas can remove flicker from videos without needing extra guidance. It works well on different types of videos and uses a neural atlas for better consistency, outperforming other methods.
Video-P2P can edit videos using advanced techniques like word swap and prompt refinement. It adapts image generation models for video, allowing for the creation of new characters while keeping original poses and scenes.
Dreamix can edit videos based on a text prompt while keeping colors, sizes, and camera angles consistent. It combines low-resolution information from the original video with newly synthesized high-resolution content, allowing for advanced editing of motion and appearance.
Shape-aware Text-driven Layered Video Editing can edit the shape of objects in videos while keeping them consistent across frames. It uses a text-conditioned diffusion model to achieve this, enabling shape changes that other layered editing methods handle poorly.
MAGVIT can perform video synthesis tasks like inpainting, outpainting, and generating animations from single images. Its inference is reported to be about 100 times faster than diffusion models and 60 times faster than autoregressive models, while also achieving the best results on multiple benchmarks.
VToonify can create high-quality artistic portrait videos from images. It allows for controllable style transfer on non-aligned faces and produces smooth, coherent videos with flexible controls on color and intensity.