How time flies! Tomorrow marks one year since I published the first issue of AI Art Weekly. A lot has happened since then. AI has grown from incoherent low-resolution image generators to fully fledged AI companions that will soon be integrated into our everyday operating systems. Whether you think these changes are exciting, scary, or both, I think we can all agree that it hasn’t been boring, at least. And from the looks of it, the future will only get wilder. A bit more than 2500 of you have decided to join me on this wild ride, and I’m more than grateful to have you all on board. Let us explore together what the future of AI holds for us. Thank you for your support 🙏

In today’s issue we cover:

  • DALL·E 3 gets integrated into ChatGPT
  • ProPainter can inpaint videos: remove watermarks and objects
  • ControlNet-XS – the lightweight version of ControlNet
  • Repainting 3D Assets with text prompts
  • SVGCustomization – text-guided vector graphics customization
  • PGDiff – guiding diffusion models for versatile face restoration
  • and more tutorials, tools and gems!

Cover Challenge 🎨

Theme: equinox
104 submissions by 61 artists
🏆 1st: @pactalom
🥈 2nd: @themyelo
🥉 3rd: @kaoru_creation
🥉 3rd: @guthrie_cd

News & Papers

DALL·E 3 x ChatGPT

OpenAI announced this week that DALL·E 3 will be built natively on ChatGPT, which will let you use ChatGPT as a brainstorming partner for generating images. The release is planned for early October, and Microsoft announced this week that it’ll be freely available in Bing as well. Is this the beginning of the end for prompt engineers? Depending on how good the images DALL·E 3 produces turn out to be, I can see this being a real shift for big players like Midjourney.

DALL·E 3 will enable iterative image generation through natural conversations

ProPainter: Improving Propagation and Transformer for Video Inpainting

ProPainter is a new video inpainting method that is able to remove objects, complete masked videos, remove watermarks and even expand the view of a video. Code can be found on GitHub.

ProPainter lets you remove watermarks from videos


ControlNet-XS

ControlNet-XS shows that it can achieve state-of-the-art results with as little as 1% of the parameters of the base model, while scoring considerably better than ControlNet on FID. Code hasn’t been released yet, but this will probably make ControlNet far more lightweight.

ControlNet-XS results

Repainting 3D Assets

As we move more and more into 3D, texturing 3D assets will become increasingly important. Repainting 3D Assets is a new method that can take any 3D asset and paint it with a given text prompt. The results, while low-resolution, are pretty impressive.

Repainting 3D Assets example

SVGCustomization: Text-Guided Vector Graphics Customization

Love this one! SVGCustomization is a novel pipeline that is able to edit existing vector images with text prompts while preserving the properties and layer information vector images are made of.

SVGCustomization examples

PGDiff: Guiding Diffusion Models for Versatile Face Restoration via Partial Guidance

PGDiff is a new method that can be used for a broad range of face restoration tasks, including blind restoration, colorization, inpainting, reference-based restoration and old photo restoration. The method is able to restore images with partial guidance, meaning you can guide the model with a rough sketch or a few strokes instead of a fully colored image. Older but still popular models like Stable Diffusion are still plagued by deformed faces, and a pipeline that combines face recognition with PGDiff could improve those.

PGDiff pipeline and examples

More papers & gems

  • FreeU: Free Lunch in Diffusion U-Net
  • DreamLLM: Synergistic Multimodal Comprehension and Creation
  • SPHP: Sparse and Privacy-enhanced Representation for Human Pose Estimation
  • MOVIN: Real-time Motion Capture using a Single LiDAR
  • JDD: Dual-Camera Joint Deblurring-Denoising
  • CartoonDiff: Training-Free Cartoon Image Generation with Diffusion Transformer Models
  • LayoutNUWA: Revealing the Hidden Layout Expertise of Large Language Models

Tools & Tutorials

These are some of the most interesting resources I’ve come across this week.

OBEY 👁️ by me. Explorations of the QR Code Monster technique with subliminal text messages. More examples here.

