Hello there, my fellow dreamers, and welcome to issue #68 of AI Art Weekly! 👋
I’ve begun sinking my teeth into a new project. Hopefully I can tell you all more about it next week 🤫 But until then, let’s check out this week’s AI art news!
- Google announced Lumiere – a new video model
- ActAnywhere generates video backgrounds
- 3DHM animates people with 3D camera control
- Depth Anything is a new monocular depth estimation method
- pix2gestalt estimates and inpaints the shape and appearance of occluded objects
- UltrAvatar generates realistic and animatable 3D avatars
- Diffuse to Choose lets you virtually place any item in any setting
- GALA splits single-layer clothed 3D human meshes into multi-layered 3D assets
- CreativeSynth is a new SOTA method for style retention in image generation
- Interview with AI powerhouse AInigma
- and more!
Cover Challenge 🎨
News & Papers
Lumiere: A Space-Time Diffusion Model for Video Generation
Lumiere is Google’s latest video model and it looks wild! The model was trained on a dataset of 30 million videos, along with their text captions, and is capable of generating 80 frames at 16 fps. It supports text-to-video, image-to-video, video inpainting and stylization. Unfortunately, Google has a track record of not releasing their models, but one can still hope.
ActAnywhere: Subject-Aware Video Background Generation
Given a subject sequence and a background image, ActAnywhere can generate video backgrounds that match the foreground motions. Pretty cool!
3DHM: Synthesizing Moving People with 3D Control
3DHM can animate people with 3D camera control from a single image and a given target video motion sequence.
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Depth Anything is a new monocular depth estimation method. The model is trained on 1.5M labeled images and 62M+ unlabeled images, which results in impressive generalization ability.
pix2gestalt: Amodal Segmentation by Synthesizing Wholes
pix2gestalt is able to estimate the shape and appearance of whole objects that are only partially visible behind occlusions.
UltrAvatar: A Realistic Animatable 3D Avatar Diffusion Model with Authenticity Guided Textures
UltrAvatar can generate realistic and animatable 3D avatars with PBR textures from a text prompt or a single image. The framework is also capable of texture editing, allowing you to change eye and hair colors, add aging effects, and even tattoos to your avatars.
Diffuse to Choose: Enriching Image Conditioned Inpainting in Latent Diffusion Models for Virtual Try-All
Shooting product images for e-commerce shops is a time-intensive task, but Diffuse to Choose by Amazon can help with that. The inpainting model lets you virtually place any item in any setting with detailed and semantically coherent blending as well as realistic lighting and shadows.
GALA: Generating Animatable Layered Assets from a Single Scan
GALA can decompose a single-layer clothed 3D human mesh into complete multi-layered 3D assets. The outputs can then be combined with other assets to create new clothed human avatars in any pose.
CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion
Retaining the style of a reference image when editing and blending images is a continued challenge. CreativeSynth outperforms other methods in those tasks. It’s not open-source yet, but the example images look promising.
- DITTO: Diffusion Inference-Time T-Optimization for Music Generation
- Hourglass Diffusion Transformers: Scalable High-Resolution Pixel-Space Image Synthesis
- RL Diffusion: Large-scale Reinforcement Learning for Diffusion Models
- GenMoStyle: Generative Human Motion Stylization in Latent Space
- EmerDiff: Emerging Pixel-level Semantic Knowledge in Diffusion Models
This week we’re talking to the AI powerhouse that is AInigma 🔥
Tools & Tutorials
These are some of the most interesting resources I’ve come across this week.
And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:
- Sharing it 🙏❤️
- Following me on Twitter: @dreamingtulpa
- Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday 😅)
- Buying a physical art print to hang on your wall
Reply to this email if you have any feedback or ideas for this newsletter.
Thanks for reading and talk to you next week!