AI Toolbox
A curated collection of 610 free, cutting-edge AI papers with code and tools for text, image, video, 3D, and audio generation and manipulation.
Love this one! SVGCustomization is a novel pipeline that is able to edit existing vector images with text prompts while preserving the properties and layer information vector images are made of.
SynTalker can generate realistic full-body motions that match speech and text prompts. It allows precise control of movements, like talking while walking.
DreamCatalyst can edit NeRF scenes in only about 25 minutes or produce high-quality results in less than 70 minutes.
GScream is yet another method for object removal in 3D scenes. This one uses Gaussian Splatting to update the radiance field and is able to preserve geometric consistency and texture coherence.
MotionMaster can extract camera motions from a single source video or multiple videos and apply them to new videos. This gives more flexible control over camera motion, resulting in videos with variable-speed zoom, pan left, pan right, dolly zoom in, dolly zoom out and more.
MambaPainter can repaint images in an oil painting style by predicting over 100 brush strokes in one step.
LVCD can colorize lineart videos using a pretrained video diffusion model. It ensures smooth motion and high video quality by effectively transferring colors from reference frames.
Gaussian-Informed Continuum for Physical Property Identification and Simulation can recover 3D objects from Gaussian point sets and simulate their physical properties.
StructLDM can generate animatable compositional humans and supports blending different body parts, identity swapping, local clothing editing, 3D virtual try-on, and more. AI girlfriends/boyfriends are definitely gonna be a thing.
RodinHD can generate high-fidelity 3D avatars from a portrait image. The method is able to capture intricate details such as hairstyles and can generalize to in-the-wild portrait input.
TeFF is similar to SphereHead, but it supports more than just human faces and can reconstruct a 360-degree view of a 3D object from a single image.
ProCreate boosts the diversity and creativity of diffusion-based image generation while avoiding the replication of training data. By pushing generated image embeddings away from reference images, it improves the quality of samples and lowers the risk of copying copyrighted content.
AudioEditor can edit audio by adding, deleting, and replacing segments while keeping unedited parts intact. It uses a pretrained diffusion model with methods like Null-text Inversion and EOT-suppression to ensure high-quality results.
MaskedMimic can generate diverse motions for interactive characters using a physics-based controller. It supports various inputs like keyframes and text, allowing for smooth transitions and adaptation to complex environments.
Prompt Sliders can control and edit concepts in diffusion models. It allows users to adjust the strength of concepts with just 3KB of storage per embedding, and it is much faster than traditional LoRA methods.
3D-Fauna is able to turn a single image of a quadruped animal into an articulated, textured 3D mesh in a feed-forward manner, ready for animation and rendering.
PhysGen can generate realistic videos from a single image and user-defined conditions, like forces and torques. It combines physical simulation with video generation, allowing for precise control over dynamics.
StoryMaker can generate a series of images with consistent characters. It keeps the same facial features, clothing, hairstyles, and body types across images, allowing for cohesive storytelling.
PortraitGen can edit portrait videos using multimodal prompts while keeping the video smooth and temporally consistent. It renders over 100 frames per second and supports various kinds of edits, such as text-driven editing and relighting.
WiLoR can localize and reconstruct multiple hands in real time from single images. It achieves smooth 3D hand tracking with high accuracy, using a large dataset of over 2 million hand images.
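For the curious: the ProCreate entry above mentions pushing generated image embeddings away from reference images. The snippet below is only a toy illustration of that general idea, not ProCreate's actual implementation. It assumes embeddings are plain NumPy vectors, and the function names `cosine` and `repel_from_references` and the `step` parameter are made up for this sketch. It takes one small gradient step that lowers the cosine similarity between an embedding and its closest reference.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def repel_from_references(z, refs, step=0.1):
    """Move embedding z one gradient step away from its most similar reference.

    z:    1-D array, the (hypothetical) embedding of a generated image.
    refs: 2-D array, one reference embedding per row.
    Returns the updated embedding and the similarity before the step.
    """
    z_norm = np.linalg.norm(z)
    ref_norms = np.linalg.norm(refs, axis=1)

    # Cosine similarity of z to every reference; pick the closest one.
    sims = refs @ z / (ref_norms * z_norm)
    k = int(np.argmax(sims))
    r = refs[k]

    # Gradient of cos(z, r) with respect to z.
    grad = r / (ref_norms[k] * z_norm) - sims[k] * z / z_norm**2

    # Step against the gradient so the similarity goes down.
    return z - step * grad, float(sims[k])


if __name__ == "__main__":
    z = np.array([1.0, 0.0])
    refs = np.array([[1.0, 1.0], [0.0, 1.0]])
    new_z, old_sim = repel_from_references(z, refs)
    print("similarity before:", old_sim)
    print("similarity after: ", cosine(new_z, refs[0]))
```

In a real diffusion pipeline a step like this would have to act as guidance during sampling, on embeddings from a learned image encoder, rather than on raw vectors as shown here.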