AI Art Weekly #42
Hello there, my fellow dreamers, and welcome to issue #42 of AI Art Weekly! 👋
Good news: we’ve reached 25% coverage of the Twitter API fees, and I’m confident we’ll be able to cover the rest of the costs, maybe not right away, but hopefully soon. Thank you so much to everyone who has contributed so far. It takes some pressure off the finances and I’m deeply grateful for your support 🙏
That being said, we’ve had another interesting week. I personally spent a lot of time working on my Claire Silver “motion” contest submission, and there have been some interesting new papers & resources. Let’s jump in:
- AnimateDiff: Text-to-Video with any Stable Diffusion models
- CSD-Edit: Multi-modal editing for 4K images and video
- HyperDreamBooth: Personalize a text-to-image diffusion model 25x faster than DreamBooth
- Animate-A-Story: Storytelling with text-to-video generation
- PGR: Facial reenactment through a personalized generator
- VampNet: Audio to loops and variations
- Interview with TRUE CAMELLIA
- Stable Doodle released
- and more
Twitter recently shut down free API access which puts our weekly cover challenges at risk. By becoming a supporter, you can help me make AI Art Weekly and its community efforts more sustainable by supporting its development & growth as well as covering monthly fees! Every contribution is deeply appreciated 🙏
Cover Challenge 🎨
With these temperatures I’m in the mood for some “funk” for next week’s cover. The reward is $50. The rulebook can be found here and images can be submitted here. Come join our Discord to talk challenges. I’m looking forward to your submissions 🙏
News & Papers
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
AnimateDiff is a new framework that brings video generation to the Stable Diffusion pipeline. This means you can generate videos with any existing Stable Diffusion model without having to fine-tune or train anything. Pretty amazing. @DigThatData put together a Google Colab notebook in case you want to give it a try.

AnimateDiff examples from different Stable Diffusion models
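If you’d rather script it than use the notebook, the core idea is that a pretrained motion module gets injected into a frozen Stable Diffusion checkpoint. Below is a minimal sketch using the AnimateDiffPipeline that later diffusers releases added for this paper; the checkpoint names are examples from the Hugging Face hub, so swap in whichever SD 1.5 fine-tune you like:

```python
import torch
from diffusers import AnimateDiffPipeline, DDIMScheduler, MotionAdapter
from diffusers.utils import export_to_gif

# Load the motion module that AnimateDiff injects into a frozen SD model
# (checkpoint name is an example from the Hugging Face hub).
adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2")

# Any SD 1.5-based checkpoint should work; swap in your favorite fine-tune.
pipe = AnimateDiffPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    motion_adapter=adapter,
    torch_dtype=torch.float16,
).to("cuda")
pipe.scheduler = DDIMScheduler.from_config(
    pipe.scheduler.config, beta_schedule="linear", clip_sample=False
)

frames = pipe(
    "a sunflower field at golden hour, cinematic",
    num_frames=16,
    num_inference_steps=25,
    guidance_scale=7.5,
).frames[0]
export_to_gif(frames, "animatediff.gif")
```

The nice part of the design is that the base model’s weights stay untouched, which is why any community fine-tune works out of the box.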
CSD-Edit: Collaborative Score Distillation for Consistent Visual Synthesis
CSD-Edit is a novel multi-modal editing approach. Compared to other methods, it works well on images larger than the traditional 512x512 limitation and can edit 4K or large panorama images. It also delivers improved temporal consistency across video frames and improved view consistency when editing or generating 3D scenes.

CSD-Edit video editing comparison with other methods
HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models
The team behind DreamBooth is back with a new paper that introduces HyperDreamBooth. The new method tackles the size and speed issues of DreamBooth while preserving model integrity, editability, and subject fidelity, and is able to personalize a text-to-image diffusion model 25x faster than DreamBooth from only a single input image.

HyperDreamBooth example
Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation
Animate-A-Story is a video storytelling approach that can synthesize high-quality, structured, and character-driven videos. Composition and scene transitions are still in their early days, but it’s interesting to see what a first text-to-story pipeline looks like.

Animate-A-Story example
PGR: Facial Reenactment Through a Personalized Generator
After seeing PGR this week, I think it’s safe to say that we can’t trust anything we see online anymore. From a creative perspective though, tech like this could eventually be used to replace actors in movie projects. Basically, you or your friends play all the roles yourselves, and something like PGR reenacts the actor you want for each role, whoever that may be.

PGR example
VampNet: Music Generation via Masked Acoustic Token Modeling
VampNet is a music generation model that can create loops and variations from short musical excerpts. It can also be fine-tuned with LoRA on custom audio datasets like playlists or specific albums.
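For context on what “fine-tuned with LoRA” means here: instead of updating the full model, LoRA freezes the pretrained weights and trains a small low-rank update alongside each layer. The sketch below is a generic PyTorch illustration of the idea, not VampNet’s actual training code:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update:
    y = W x + (alpha / r) * B A x, where A and B are small matrices."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay frozen
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)  # update starts as a no-op
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))

# Only the tiny A/B matrices are trained, which is why a
# "fine-tune on a playlist" stays cheap to train and store.
layer = LoRALinear(nn.Linear(512, 512), rank=8)
trainable = [p for p in layer.parameters() if p.requires_grad]
```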
More gems & papers
- 3D VADER: AutoDecoding Latent 3D Diffusion Models
- My3DGen: Building Lightweight Personalized 3D Generative Model
- Divide, Evaluate, and Refine: Evaluating and Improving Text-to-Image Alignment with Iterative VQA Feedback
- LSV: Efficient 3D Articulated Human Generation with Layered Surface Volumes
- DATENCODER: Domain-Agnostic Tuning-Encoder for Fast Personalization of Text-To-Image Models
- FreeDrag: Point Tracking is Not You Need for Interactive Point-based Image Editing
My submission LIVE.DIE.REPEAT. for @ClaireSilver12’s 5th AI art contest with the theme “motion” got some love when I published it on Tuesday. So I thought I would put together a “Behind-The-Scenes” explanation of how it was made. Enjoy.
@skirano has been expanding scenes from the movie Raiders of the Lost Ark with generative fill AI.
@ilumine_ai has been exploring AI’s potential in the video game industry and put together a crazy highlight reel.
Interviews
This week @annadart_artist and I talked to AI surrealism artist @truecamellia.
Tools & Tutorials
These are some of the most interesting resources I’ve come across this week.
Stability AI published a new ClipDrop tool called Stable Doodle. It uses a T2I adapter behind the scenes to let you guide image generation with sketches.
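Stable Doodle itself is a hosted tool, but the underlying sketch-conditioning idea is available in diffusers via T2I-Adapter. A minimal sketch, assuming the TencentARC adapter checkpoints on the Hugging Face hub (model names and settings are examples, not Stable Doodle’s actual configuration):

```python
import torch
from PIL import Image
from diffusers import StableDiffusionAdapterPipeline, T2IAdapter

# Sketch-conditioned adapter for SD 1.5 (checkpoint name is an example).
adapter = T2IAdapter.from_pretrained(
    "TencentARC/t2iadapter_sketch_sd15v2", torch_dtype=torch.float16
)
pipe = StableDiffusionAdapterPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    adapter=adapter,
    torch_dtype=torch.float16,
).to("cuda")

# A rough black-and-white doodle guides the composition,
# while the prompt controls style and content.
doodle = Image.open("doodle.png").convert("L")
image = pipe(
    "a cozy cabin in a snowy forest, watercolor",
    image=doodle,
    num_inference_steps=30,
).images[0]
image.save("cabin.png")
```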
While SDXL support for Automatic1111 is on its way, ComfyUI already supports it. It’s a powerful and modular Stable Diffusion GUI with a graph/nodes interface. Not everyone’s cup of tea, but one you might enjoy.
If you’re new to ZeroScope text-to-video, @fofrAI put together a short and handy guide to cover the basic settings like fps, steps and upscaling.
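If you’d rather experiment locally, ZeroScope also runs through diffusers. A minimal sketch, with settings mirroring the basics the guide covers (steps, frame count, resolution); the exact output format of `.frames` can differ between diffusers versions:

```python
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

# ZeroScope v2 576w is trained for 576x320 output; upscaling is a separate pass.
pipe = DiffusionPipeline.from_pretrained(
    "cerspense/zeroscope_v2_576w", torch_dtype=torch.float16
)
pipe.enable_model_cpu_offload()  # helps on smaller GPUs

frames = pipe(
    "a corgi surfing a wave at sunset",
    num_inference_steps=40,
    num_frames=24,
    height=320,
    width=576,
).frames
export_to_video(frames, "corgi.mp4")
```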
@madaro_art put together a Twitter thread about Midjourney blending techniques. Interesting read.
@MatthewPStewart put together an interesting summary discussing the Authors Guild v. Google district court case, in which the court decided that using copyrighted material in a dataset to train a discriminative machine-learning algorithm is perfectly legal, setting a precedent for training generative AI models.

Whispers of memories breathe through the minimal colorful modern landscapes of 'Silent Echoes,' a twilight analogue photograph, where stark contrasts of black and vibrant whites converge, evoking haunting expressionist undertones inspired by René Magritte and Daido Moriyama, amidst contemporary political tension in a totalitarian state --s 250 --v 5.2 --style raw --ar 3:2
Prompt by @moelucio and made in our Discord.
And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:
- Sharing it 🙏❤️
- Following me on Twitter: @dreamingtulpa
- Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday 😅)
Reply to this email if you have any feedback or ideas for this newsletter.
Thanks for reading and talk to you next week!
– dreamingtulpa