Image-to-Video
Free image-to-video AI tools for quickly transforming images into dynamic videos, perfect for content creators and filmmakers.
DisPose can generate high-quality human image animations from sparse skeleton pose guidance.
ObjCtrl-2.5D enables object control in image-to-video generation using 3D trajectories from 2D inputs with depth information.
Inverse Painting can generate time-lapse videos of the painting process from a target artwork. It uses a diffusion-based renderer to learn from real artists’ techniques, producing realistic results across different artistic styles.
CamI2V is a method that can generate videos from images with precise camera-movement control and text-prompt guidance.
JoyVASA can generate high-quality lip-sync videos of human and animal faces from a single image and speech clip.
DimensionX can generate photorealistic 3D and 4D scenes from a single image using controllable video diffusion.
SG-I2V can control object and camera motion in image-to-video generation using bounding boxes and trajectories.
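To illustrate the kind of conditioning such trajectory-based control implies, here is a conceptual sketch (not SG-I2V's actual API; the function name and signature are hypothetical) that turns a sparse trajectory of waypoints into one bounding box per frame:

```python
def boxes_along_trajectory(size, waypoints, num_frames):
    """Produce one (cx, cy, w, h) box per frame by linearly
    interpolating the box centre along a polyline of waypoints.
    This is a toy illustration of trajectory conditioning, not
    SG-I2V's implementation."""
    w, h = size
    n = len(waypoints) - 1  # number of polyline segments
    boxes = []
    for f in range(num_frames):
        # map frame index to a position along the whole polyline
        t = f / (num_frames - 1) * n if num_frames > 1 else 0.0
        i = min(int(t), n - 1)  # which segment we are on
        u = t - i               # fractional position within it
        (x0, y0), (x1, y1) = waypoints[i], waypoints[i + 1]
        cx = x0 + u * (x1 - x0)
        cy = y0 + u * (y1 - y0)
        boxes.append((cx, cy, w, h))
    return boxes
```

A real model would consume these per-frame boxes as spatial guidance during denoising; the sketch only shows how sparse user input expands into dense per-frame conditioning.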
Pyramidal Flow Matching can generate high-quality 5- to 10-second videos at 768p resolution and 24 FPS. It uses a unified pyramidal flow matching algorithm to link flows across different stages, making video creation more efficient.
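The coarse-to-fine idea behind a pyramidal schedule can be sketched as follows (a hypothetical stage layout where each stage doubles the spatial resolution, not the paper's exact schedule):

```python
def pyramid_resolutions(final_height, final_width, num_stages):
    """Toy coarse-to-fine resolution schedule: each later stage
    doubles the spatial resolution, ending at the target size.
    Illustrative only; the actual Pyramidal Flow Matching stage
    design differs."""
    return [
        (final_height >> (num_stages - 1 - s),
         final_width >> (num_stages - 1 - s))
        for s in range(num_stages)
    ]
```

Working at reduced resolution in early stages is what makes such pyramidal approaches cheaper than generating every step at full resolution.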
TCAN can animate characters of various styles from a pose guidance video.
Time Reversal can generate the in-between frames of two input images. In particular, this enables looping cinemagraphs as well as videos with camera and subject motion.
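For contrast, the naive baseline for in-between frames is a simple cross-fade, sketched below with frames as flat lists of pixel values (hypothetical helper, not part of Time Reversal; a diffusion model instead synthesizes plausible motion rather than blending pixels):

```python
def crossfade_frames(frame_a, frame_b, num_inbetween):
    """Naive in-betweening baseline: linear pixel blending between
    two frames. Frames are flat lists of float pixel values."""
    frames = []
    for k in range(1, num_inbetween + 1):
        t = k / (num_inbetween + 1)  # blend weight for frame_b
        frames.append([(1 - t) * a + t * b
                       for a, b in zip(frame_a, frame_b)])
    return frames
```

Cross-fading produces ghosting when objects move; that failure mode is exactly what learned in-betweening methods are designed to avoid.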
PhysGen can generate realistic videos from a single image and user-defined conditions, like forces and torques. It combines physical simulation with video generation, allowing for precise control over dynamics.
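The physics side of such a pipeline amounts to integrating user-defined forces and torques into per-frame object poses. A minimal sketch, assuming a 2D rigid body and semi-implicit Euler integration (illustrative only, not PhysGen's simulator):

```python
def simulate_rigid_body(force, torque, mass, inertia, num_frames, dt):
    """Semi-implicit Euler integration of a 2D rigid body under a
    constant force (fx, fy) and torque, returning one (x, y, angle)
    pose per frame. Toy example of physics-driven conditioning."""
    fx, fy = force
    x = y = angle = 0.0
    vx = vy = omega = 0.0
    poses = []
    for _ in range(num_frames):
        # update velocities first (semi-implicit), then positions
        vx += fx / mass * dt
        vy += fy / mass * dt
        omega += torque / inertia * dt
        x += vx * dt
        y += vy * dt
        angle += omega * dt
        poses.append((x, y, angle))
    return poses
```

In a PhysGen-style system, pose sequences like this would drive the video generator, which renders them photorealistically.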
Puppet-Master can create realistic motion in videos from a single image using simple drag controls. It uses a fine-tuned video diffusion model and an all-to-first attention mechanism to produce high-quality videos.
LivePortrait can animate a single source image with motion from a driving video. The method generates high-quality videos at 60 fps and can retarget the motion to other characters.
Motion Prompting can control video generation using motion paths. It allows for camera control, motion transfer, and drag-based image editing, producing realistic movements and physics.
MVOC is a training-free method for composing multiple video objects with diffusion models. It can composite multiple video objects into a single video while maintaining motion and identity consistency.
Conditional Image Leakage can be used to generate videos with more dynamic and natural motion from image prompts.
Image Conductor can generate video assets from a single image with precise control over camera transitions and object movements.
Mora can enable generalist video generation through a multi-agent framework. It supports text-to-video generation, video editing, and digital world simulation, achieving performance similar to the Sora model.
VimTS can extract text from images and videos, with improved generalization across different types of media.
TRIP is a new approach to image-to-video generation with better temporal coherence.