AI Art Weekly #41

Hello there, my fellow dreamers, and welcome to issue #41 of AI Art Weekly! 👋

We’ve finally surpassed the 2,000-subscriber milestone 🥳. Thank you all for your support and for being part of this community. People say the bigger the audience gets, the easier it is to grow. I’ve found this to be absolutely untrue. Getting the last 250 of you on board has been the toughest challenge yet, so I’m glad you all made it here 🧡

Let’s jump into this week’s issue. Here are the highlights:

  • Stable Diffusion XL 0.9 weights leaked and 1.0 release date revealed
  • New Midjourney panning feature
  • Artistic Cinemagraph: Animating images from text
  • DragonDiffusion: DragGAN for diffusion models
  • SketchMetaFace: Sketching 3D faces
  • DreamIdentity: Efficient face-identity preserved image generation from a single image
  • Voicebox: Meta’s new speech synthesis model
  • RobustL2S: Lip-to-speech synthesis 🤯
  • Interview with AI artists Polza and DEHISCENCE

Cover Challenge 🎨

Theme: renaissance
108 submissions by 66 artists
AI Art Weekly Cover Art Challenge renaissance submission by newmediapioneer
🏆 1st: @newmediapioneer
AI Art Weekly Cover Art Challenge renaissance submission by UnoParticular
🥈 2nd: @UnoParticular
AI Art Weekly Cover Art Challenge renaissance submission by shany_gin
🥉 3rd: @shany_gin
AI Art Weekly Cover Art Challenge renaissance submission by Eye_em_AI
🥉 3rd: @Eye_em_AI

News & Papers

Stable Diffusion XL 0.9 weights leaked, 1.0 release date and a new Midjourney panning feature

The Stable Diffusion XL 0.9 weights were leaked on HuggingFace this week and removed as promptly as they were put up. Some folks were fast enough to snatch them though, so they’re still available via torrent. Emad advised holding off on training with or relying too much on 0.9, as 1.0 will get additional RLHF fine-tuning that apparently makes a big difference. According to the man himself, 1.0 will be released on July 18th.

Midjourney, meanwhile, pushed a new panning feature that lets you extend upscales horizontally or vertically.

Panorama made with the new Midjourney panning feature. Not really a good representation of the quality at this scale, but you get the idea.

Artistic Cinemagraph: Synthesizing Artistic Cinemagraphs from Text

Recent AI advancements in image segmentation are enabling new possibilities. Artistic Cinemagraph, for instance, makes it possible to animate flowing clouds or water in images fully automatically from text descriptions. You can even describe which direction the water should flow. The best part is that it also works on existing images and paintings, so you can bring shots from your last holiday to life.

Text-To-Cinemagraph examples

DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models

It’s been 7 weeks since DragGAN was announced, and one week since the official implementation was released. This week, we got DragonDiffusion: basically the DragGAN equivalent, but for diffusion models.

Early DragonDiffusion implementation examples (DragDiffusion)

SketchMetaFace: A Learning-based Sketching Interface for High-fidelity 3D Character Face Modeling

Similar to ControlNet scribble for images, SketchMetaFace brings sketch guidance to the 3D realm and makes it possible to turn a sketch into a 3D face model. I’m pretty excited about progress like this, as it will bring controllability to 3D generation and make generating 3D content far more accessible.

SketchMetaFace demo

DreamIdentity: Improved Editability for Efficient Face-identity Preserved Image Generation

Having to train a model on additional concepts and faces might soon be a thing of the past. Given only one facial image, DreamIdentity can efficiently generate countless identity-preserved and text-coherent images in different contexts without any test-time optimization.

DreamIdentity examples


Voicebox: Meta’s new speech synthesis model

Meta announced Voicebox a few weeks ago, an impressive speech synthesis model that can generate speech from text, transfer the style of an audio input, edit spoken text, or remove background noise from audio clips. It will probably never be released as an open-source model, but it’s still interesting to see what will be possible in the near future.

Voicebox Text-to-Speech visualization. Video with sound within the link above.

RobustL2S: Speaker-Specific Lip-to-Speech Synthesis via Self-Supervised Learning

OK, this is a crazy one. We’ve seen a lot of research around the same concepts in the past few weeks (image, video, 3D, speech and audio), but I get especially excited when I see something new that I wasn’t aware was possible. RobustL2S is one of these cases: a lip-to-speech synthesis model. That means it can transform video footage of moving lips into audio 🤯 I can’t wait to try this on some text-to-video output if the code ever gets released.

RobustL2S architecture

More gems

  • DisCo: Disentangled Control for Referring Human Dance Generation in Real World (it’s over for the TikTok dancers)
  • HNC-CAD: Hierarchical Neural Coding for Controllable CAD Model Generation
  • ProxyCap: Real-time Monocular Full-body Capture in World Space via Sequential Proxy-to-Motion Learning
  • DiT-3D: Exploring Plain Diffusion Transformers for 3D Shape Generation
  • Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors


Interview with AI artists Polza and DEHISCENCE

Because the AISurrealism interviews were received so well, @annadart_artist and I decided to bring you two more this week.

Tools & Tutorials

These are some of the most interesting resources I’ve come across this week.

full body shot of medieval knight with compressed gum wearing flowing robes playing poker with zebras on the roof of skyscraper, still life analog ultra minimalist photography, confessional, life-sized installations, high angle, muted hues, 1990s --q 1 --ar 3:2 --style raw --c 15. Made by me in our Discord.

And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:

  • Sharing it 🙏❤️
  • Following me on Twitter: @dreamingtulpa
  • Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday 😅)

Reply to this email if you have any feedback or ideas for this newsletter.

Thanks for reading and talk to you next week!

– dreamingtulpa
