Hello there, my fellow dreamers, and welcome to issue #42 of AI Art Weekly! 👋
Good news: we’ve reached 25% coverage for the Twitter API fees, and I’m confident we’ll be able to cover the rest of the costs, maybe not right away, but hopefully soon. Thank you so much to everyone who has contributed so far. It takes some pressure off the finances and I’m deeply grateful for your support 🙏
That said, we’ve had another interesting week behind us. I personally spent a lot of time working on my Claire Silver “motion” contest submission, and there have been some interesting new papers & resources. Let’s jump in:
- AnimateDiff: Text-to-Video with any Stable Diffusion models
- CSD-Edit: Multi-modal editing for 4K images and video
- HyperDreamBooth: Personalize a text-to-image diffusion model 25x faster than DreamBooth
- Animate-A-Story: Storytelling with text-to-video generation
- PGR: Facial reenactment through a personalized generator
- VampNet: Audio to loops and variations
- Interview with TRUE CAMELLIA
- Stable Doodle released
- and more
Cover Challenge 🎨
News & Papers
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
AnimateDiff is a new framework that brings video generation to the Stable Diffusion pipeline. That means you can generate videos with any existing Stable Diffusion model without having to fine-tune or train anything. Pretty amazing. @DigThatData put together a Google Colab notebook in case you want to give it a try.
CSD-Edit: Collaborative Score Distillation for Consistent Visual Synthesis
CSD-Edit is a novel multi-modal editing approach. Compared to other methods, it works well on images larger than the traditional 512x512 limit and can edit 4K or large panorama images, offers improved temporal consistency across video frames, and delivers improved view consistency when editing or generating 3D scenes.
HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models
The team behind DreamBooth is back with a new paper that introduces HyperDreamBooth. The new method tackles the size and speed issues of DreamBooth while preserving model integrity, editability, and subject fidelity, and it can personalize a text-to-image diffusion model 25x faster than DreamBooth with only a single input image.
Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation
Animate-A-Story is a video storytelling approach that can synthesize high-quality, structured, and character-driven videos. Composition and scene transitions are still in the early days, but it’s interesting to see what a first text-to-story pipeline looks like.
PGR: Facial Reenactment Through a Personalized Generator
After seeing PGR this week, I think it’s safe to say that we can’t trust anything we see online anymore. From a creative perspective though, tech like this could potentially be used down the road to replace actors in movie projects. Basically, you or your friends play all the roles yourselves, and something like PGR will reenact the actor you want for each role, whoever that may be.
VampNet: Music Generation via Masked Acoustic Token Modeling
VampNet is a music generation model that can create loops and variations from short musical excerpts and can be fine-tuned with LoRA on custom audio datasets like playlists or specific albums.
More gems & papers
- 3D VADER: AutoDecoding Latent 3D Diffusion Models
- My3DGen: Building Lightweight Personalized 3D Generative Model
- Divide, Evaluate, and Refine: Evaluating and Improving Text-to-Image Alignment with Iterative VQA Feedback
- LSV: Efficient 3D articulated human generation with layered surface volumes
- DATENCODER: Domain-Agnostic Tuning-Encoder for Fast Personalization of Text-To-Image Models
- FreeDrag: Point Tracking is Not What You Need for Interactive Point-based Image Editing
Tools & Tutorials
These are some of the most interesting resources I’ve come across this week.
And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:
- Sharing it 🙏❤️
- Following me on Twitter: @dreamingtulpa
- Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday 😅)
Reply to this email if you have any feedback or ideas for this newsletter.
Thanks for reading and talk to you next week!