Hello there, my fellow dreamers, and welcome to issue #56 of AI Art Weekly! 👋
Two major developments from the generative AI art front this week. Apple has released research on Matryoshka Diffusion Models, its take on text-to-image models. And Latent Consistency Models might be the next evolution of diffusion models, generating images in just 1-4 steps. Let's jump in:
- Latent Consistency Models – faster text-to-image!
- Apple’s Matryoshka Diffusion Models
- FreeNoise can create text-to-video with 512 frames
- DreamCraft3D: High quality 3D generation
- Wonder3D is another image to 3D method
- Zero123++ can generate multi-view images from a single input image
- E4S is a new method for fine-grained face swapping
- and more tutorials, tools and gems!
Cover Challenge 🎨
News & Papers
Latent Consistency Models: Synthesizing High-Resolution Images with Few-step Inference
A new category of generative models is emerging: Latent Consistency Models (LCMs). These models can be distilled from pre-trained Stable Diffusion models and generate high-quality 768x768 images in only one to four steps, significantly accelerating text-to-image generation. For comparison, traditional diffusion models require 20-50 steps. Early results suggest that, with further optimization, this could bring generation times down to around 100ms on powerful GPUs.
Matryoshka Diffusion Models
Apple is getting into the generative AI game. Matryoshka Diffusion Models (MDM) are its latest research for generating high-quality text-to-image and text-to-video results with a multi-resolution diffusion model that can produce outputs at resolutions of up to 1024x1024 pixels. Unlike Stable Diffusion or Google's Imagen, MDM doesn't require a pre-trained VAE or any additional upscaling modules and can be trained much more efficiently. The code isn't available yet, but will apparently be released soon.
FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling
FreeNoise is a new method that can generate longer videos of up to 512 frames from multiple text prompts. That's about 21 seconds of footage at 24fps. The method doesn't require any additional fine-tuning of the video diffusion model and only takes about 20% more time than the original diffusion process.
DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior
3D generations are getting more sophisticated by the week. DreamCraft3D can create high-quality 3D objects from a single prompt. It uses a 2D reference image to guide the sculpting of the 3D object and then improves texture fidelity by running it through a fine-tuned Dreambooth model.
Wonder3D: Single Image to 3D using Cross-Domain Diffusion
Wonder3D is yet another image-to-3D method. This one is able to convert a single image into a high-fidelity 3D model, complete with textured meshes and color. The entire process takes only 2 to 3 minutes.
Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model
E4S: Fine-Grained Face Swapping via Regional GAN Inversion
E4S is a new method for fine-grained face swapping. It can swap faces in images and videos while preserving the source face's identity, texture, and shape, as well as the lighting of the original footage.
More papers & gems
- MAS: Multi-view Ancestral Sampling for 3D motion generation using 2D diffusion
- Relit-NeuLF: Efficient Novel View Synthesis with Neural 4D Light Field
- PERF: Panoramic Neural Radiance Field from a Single Panorama
- DreamSpace: Dreaming Your Room Space with Text-Driven Panoramic Texture Propagation
Tools & Tutorials
These are some of the most interesting resources I’ve come across this week.
And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:
- Sharing it 🙏❤️
- Following me on Twitter: @dreamingtulpa
- Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday 😅)
- Buying a physical art print to hang on your wall
Reply to this email if you have any feedback or ideas for this newsletter.
Thanks for reading and talk to you next week!