AI Art Weekly #82
Hello there, my fellow dreamers, and welcome to issue #82 of AI Art Weekly! 👋
Greetings from Spain. While the theme parks of the future are in the making and AI agents are taking over phone calls, I’m chilling on the beach and enjoying the sun for the next few days 😎
In this issue:
- 3D: X-Oscar, GS-LRM, MaPa, Invisible Stitch, DGE, DreamScene4D, StableMoFusion
- Video: StoryDiffusion, FlexiFilm, Tunnel Try-on, VimTS, AniTalker, SwapTalk
- Image: Parts2Whole, Anywhere, Pair Customization, MasterWeaver, An Empty Room is All We Want, AT-EDM, Diffusion2GAN
- and more!
Want me to keep up with AI for you? Well, that requires a lot of coffee. If you like what I do, please consider buying me a cup so I can stay awake and keep doing what I do 🙏
Cover Challenge 🎨
For next week's cover I’m looking for red submissions! Reward is again $50 and a rare role in our Discord community which lets you vote in the finals. Rulebook can be found here and images can be submitted here.
News & Papers
3D
X-Oscar: A Progressive Framework for High-quality Text-guided 3D Animatable Avatar Generation
Not a week goes by without new 3D object generators! X-Oscar can generate high-quality 3D avatars from text prompts which can be animated and edited from within Blender.
GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting
GS-LRM can generate high-quality 3D Gaussian primitives from just 2-4 posed sparse images in 0.23 seconds on a single GPU.
MaPa: Text-driven Photorealistic Material Painting for 3D Shapes
MaPa can generate high-quality materials for 3D meshes! It can create segment-wise procedural material graphs as the appearance representation, which supports high-quality rendering and provides significant flexibility in editing.
Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting
Invisible Stitch can inpaint missing depth information in a 3D scene, resulting in improved geometric coherence and smoother transitions between frames.
DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing
DGE is a Gaussian Splatting method that can be used to edit 3D objects and scenes based on text prompts.
DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos
Not long now until we can turn videos into 4D videos. DreamScene4D can lift multi-object monocular videos into 4D scenes using dynamic Gaussian Splatting while supporting occlusions and Gaussian Motion Trajectories. You’re not gonna win an Oscar with this, but it’s an exciting start.
StableMoFusion: Towards Robust and Efficient Diffusion-based Motion Generation Framework
StableMoFusion is a new method for human motion generation that is able to eliminate foot-skating and create stable and efficient animations. The method is based on diffusion models and can be used for real-time scenarios such as virtual characters and humanoid robots.
Video
StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation
StoryDiffusion can generate long-range images and videos that are able to maintain consistent content across a series of generated frames. The method is able to convert a text-based story into a video with smooth transitions and consistent subjects.
FlexiFilm: Long Video Generation with Flexible Conditions
FlexiFilm is another new text-to-video model which aims to generate long videos, in this case each over 30 seconds in length. Only one example has been published so far, but this could get interesting once it gets the ZeroScope XL treatment.
Tunnel Try-on: Excavating Spatial-temporal Tunnels for High-quality Virtual Try-on in Videos
Virtual try-on for videos is coming! Tunnel Try-On can preserve the details of clothing and model motions in videos and supports different types of backgrounds and movements.
VimTS: A Unified Video and Image Text Spotter for Enhancing the Cross-domain Generalization
I’m a sucker for data visualization. VimTS can be used to extract textual information from image or video sequences.
AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding
AniTalker is another talking head generator that can animate talking faces from a single portrait and input audio with naturally flowing movements and diverse outcomes.
SwapTalk: Audio-Driven Talking Face Generation with One-Shot Customization in Latent Space
SwapTalk is yet another audio-driven talking face generator! This one can transfer the facial region of an input avatar onto a target video while lip-syncing to a specified audio clip.
Image
Parts2Whole: From Parts to Whole
Parts2Whole can generate customized human portraits from multiple reference images, including pose images and various aspects of human appearance. The method is able to generate human images conditioned on selected parts from different humans as control conditions, allowing you to create images with specific combinations of facial features, hair, clothes, etc.
Anywhere: A Multi-Agent Framework for Reliable and Diverse Foreground-Conditioned Image Inpainting
Anywhere can place any object from an input image into any suitable and diverse location in an output image. Perfect for product placement.
Pair Customization
I’ve been playing around with SDXL LoRAs this week, and while it kinda works, it also requires a lot of work to get them to a point where they look good, especially when it comes to the dataset. That’s why I’m always on the lookout for an alternative that requires less. Pair Customization can customize text-to-image models with a single image pair. This allows you to apply a stylistic change to an image without overfitting to the specific image content in your dataset.
MasterWeaver: Taming Editability and Identity for Personalized Text-to-Image Generation
MasterWeaver is yet another personalization method. It can generate and edit photo-realistic images with diverse clothing, accessories, facial attributes and actions in various contexts from a single reference image and text prompt.
An Empty Room is All We Want: Automatic Defurnishing of Indoor Panoramas
An Empty Room is All We Want can remove furniture from indoor panorama images. Even Jordan Peterson would be proud. Perfect for seeing what your own apartment, or the one you’re looking at, would look like without all the clutter.
AT-EDM: ATtention-Driven Training-Free Efficiency Enhancement of Diffusion Models
Adobe has found a way to reduce the computational cost of diffusion models during inference without retraining. AT-EDM is able to achieve up to 40% FLOPs reduction while maintaining nearly the same image quality as the full model.
Diffusion2GAN: Distilling Diffusion Models into Conditional GANs
Diffusion2GAN is a method to distill a complex multistep diffusion model into a single-step conditional GAN student model, dramatically accelerating inference while preserving image quality. This enables one-step 512px/1024px image generation at an interactive speed of 0.09/0.16 seconds as well as 4K image upscaling!
Also interesting
- Exploring Vision Transformers for 3D Human Motion-Language Models with Motion Patches
@amorvobiscum is curating an AI exhibition during Art Basel with 48 AI artists dedicated to the timeless theme of ETERNAL PEACE. I’m honoured to be one of them :)
@paultrillo blesses us with a mesmerizing new Sora music video for Washed Out’s new track “The Hardest Part”. I want access so bad 😭
@ScottieFoxTTV created this cool toy called “PromptZone” for his kids. Usability is still a big issue when it comes to Generative AI. This will make it easier to ease the next generation into the world of diffusion.
@moon__theater and @onchainsherpa collaborated to create this ethereal AI piece. Love it!
@shanef3d created a beautiful Miyazaki-esque short incorporating AI techniques. The behind-the-scenes are also worth checking out.
And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:
- Sharing it 🙏❤️
- Following me on Twitter: @dreamingtulpa
- Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday 😅)
- Buying a physical art print to hang on your wall
Reply to this email if you have any feedback or ideas for this newsletter.
Thanks for reading and talk to you next week!
– dreamingtulpa