AI Art Weekly #61

Hello there, my fellow dreamers, and welcome to issue #61 of AI Art Weekly! 👋

This week has been the craziest week for AI-related research since I started writing this newsletter. Usually I skim through roughly 50-80 papers per week to summarize the most interesting advancements for you. This week my paper collection script churned out 186.

On top of that, companies are releasing and announcing new products built on top of that research almost by the day. I already felt how things were picking up during the last few weeks, but this is nuts.

We’re truly living through historical times, so it makes me happy that we cracked the 3’000 subscribers milestone this week. Thank you all for subscribing to my little weekly write-up 🧡

The highlights of the week are:

  • Stable Diffusion Turbo is here
  • Pika 1.0 announced
  • Adobe’s DMD generates images in 90ms
  • Sketch Video Synthesis turns videos into sketches
  • SparseCtrl adds sparse controls to text-to-video models
  • Diffusion Motion Transfer can edit videos with a text prompt
  • MVControl brings ControlNet to 3D generation
  • Material Palette can extract materials from a single image
  • 4D-fy and Dream-in-4D can generate 4D videos
  • LucidDreamer turns a single image into a 3D scene
  • Control4D lets you edit avatars in 4D
  • GAIA generates realistic talking people from a single image and speech clip
  • A new image upscaler called CoSeR
  • LEDITS++ can edit images fast with a text prompt
  • and more tutorials, tools and gems!

Cover Challenge 🎨

Theme: movember
100 submissions by 61 artists
AI Art Weekly Cover Art Challenge movember submission by DonaTimani
🏆 1st: @DonaTimani
AI Art Weekly Cover Art Challenge movember submission by pactalom
🥈 2nd: @pactalom
AI Art Weekly Cover Art Challenge movember submission by Noxifer81
🥉 3rd: @Noxifer81
AI Art Weekly Cover Art Challenge movember submission by EmiliasPod
🧡 4th: @EmiliasPod

News & Papers

Stable Diffusion goes Turbo with Adversarial Diffusion Distillation

Stability AI released the 2nd of their 5 major releases this week. Adversarial Diffusion Distillation is the driver behind the new SD Turbo and SDXL Turbo models which are able to generate high-quality 512x512 images in near real-time at ~200ms.

Faster image generation means new possibilities for interactive and creative applications, and people are already exploring different options. If you’re GPU-poor, you can give Turbo a try on Clipdrop. If not, you can run it locally using Pinokio and ComfyUI.

The speed of SD Turbo enables real-time prompt exploration, for instance

Pika 1.0

Pika, one of the most popular video generation tools in the AI art community, announced their new 1.0 release this week. Aside from a more accessible web interface, the trailer hints at some new upcoming features:

  • Video inpainting to edit and customize scenes and subjects in videos
  • Video outpainting to adjust video aspect ratios
  • Video-to-video to re-style videos similar to Gen-1

Sign up for the waitlist at pika.art to get access when it’s ready.

Pika 1.0 inpainting sneak peek

DMD: One-step Diffusion with Distribution Matching Distillation

Adobe isn’t sitting idle next to Stability either. DMD is yet another real-time method that is able to generate high-quality images with just a single step in only 90ms, claiming it can generate images at 20 FPS. Unfortunately, it’s not open-source (yet), so there’s no way to confirm or test it.

DMD speed comparison
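
For a rough sense of what the latencies quoted in this issue mean in frames per second, here is a minimal back-of-the-envelope sketch. The helper names are made up for illustration, and the only inputs are the numbers mentioned above (~200ms for SD Turbo, 90ms and 20 FPS for DMD).

```python
def latency_to_fps(latency_ms: float) -> float:
    """Convert a per-image latency in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


def fps_to_latency(fps: float) -> float:
    """Convert a frame rate to the per-image latency (ms) it implies."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


if __name__ == "__main__":
    # Latencies as quoted in this issue; not measured by this script.
    for name, ms in [("SD Turbo (~200ms)", 200.0), ("DMD (90ms)", 90.0)]:
        print(f"{name}: {latency_to_fps(ms):.1f} FPS")
    print(f"20 FPS implies {fps_to_latency(20.0):.0f}ms per image")
```

Taken at face value, 90ms per image works out to roughly 11 FPS, while 20 FPS would need about 50ms per image. The two figures presumably come from different measurement setups; that is an assumption, not something the announcement spells out.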

Sketch Video Synthesis

After LiveSketch last week, Sketch Video Synthesis is another personal favourite of mine. This one can turn subjects of videos into SVG sketches, enabling various rendering techniques, including resizing, color filling, and overlaying doodles on the original background images.

Sketch Video Synthesis example

SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models

SparseCtrl is an image-to-video method with some cool new capabilities. With its RGB, depth and sketch encoders and one or a few input images, it can animate images, interpolate between keyframes, extend videos, and guide video generation with only depth maps or a few sketches. I’m especially in love with how the scene transitions look.

SparseCtrl keyframe interpolation example

Diffusion Motion Transfer

Diffusion Motion Transfer is able to translate videos with a text prompt while maintaining the input video’s motion and scene layout.

DMT turning a dog into a horse

MVControl: Adding Conditional Control to Multi-view Diffusion for Controllable Text-to-3D Generation

ControlNet for generative 3D anyone? MVControl is bringing edge & depth map guidance to pre-trained multi-view 2D diffusion models. I can see a pipeline where image-to-edge-map-to-3D might even further improve output.

1girl, full body of virgin mary, holy, open-armed, in digital art illustration style, hyper detailed, smooth, sharp focus, masterpiece, best quality. I can see the prompter is a fellow weeb 😅

Material Palette: Extraction of Materials from a Single Image

Material Palette can extract a palette of PBR materials (albedo, normals, and roughness) from a single real-world image. Looks very useful for creating new materials for 3D scenes or even for generating textures for 2D art.

Material Palette example

4D-fy & Dream-in-4D

4D-fy and Dream-in-4D are two new methods to generate 4D dynamic content from a text prompt or image. The results from both approaches are a bit noisy. Personally, I like it! The funkiness adds a lot of charm to it.

A panda is riding a bicycle. generated with Dream-in-4D

LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes

LucidDreamer can generate navigable 3D Gaussian Splat scenes out of a single text prompt or a single image. Text prompts can also be chained for more output control. Can’t wait until they can also be animated.

LucidDreamer example

Control4D: Efficient 4D Portrait Editing with Text

Speaking of Gaussian Splatting: Control4D proposes GaussianPlanes, which makes Gaussian Splatting more structured and enhances 4D editing of videos, in this case avatars.

Control4D example

GAIA, SyncTalk, Diffusion Avatars, Portrait4D and CosAvatar

And speaking of avatars, there is even more. It’s been a while since we saw any significant updates when it comes to avatar generation, but this week we got not only one or two updates, no, five 🤯! Crazy times ahead.

  • GAIA generates talking heads from a single portrait image and speech clip
  • SyncTalk aims to optimize lip and head motion synchronisation
  • DiffusionAvatars generates high-fidelity 3D avatars which offer pose and expression control
  • Portrait4D can turn portrait images into photorealistic 4D head avatars
  • CosAvatar provides both global style editing and local attribute editing while ensuring strong consistency

GAIA example. Check project page for examples with sound.

CoSeR: Bridging Image and Language for Cognitive Super-Resolution

CoSeR is a Stable Diffusion-based upscaling method that comprehends low-resolution images and generates a high-quality reference image to optimize the super-resolution process. Results look like the best I’ve seen yet.

CoSeR example

Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation

Remember last week? Animate Anyone, just like MagicDance, turns single images and pose guidance into dancing or moving videos, only already much better.

Animate Anyone example

More papers & gems

  • ParaDiffusion: Paragraph-to-Image Generation with Information-Enriched Diffusion Model
  • Wired Perspectives: Multi-View Wire Art Embraces Generative AI
  • Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling
  • SceneTex: High-Quality Texture Synthesis for Indoor Scenes via Diffusion Priors
  • GenZI: Zero-Shot 3D Human-Scene Interaction Generation

Tools & Tutorials

These are some of the most interesting resources I’ve come across this week.

“Redfall: Initiation” by me is available until the 5th of December, 6pm CET. Each purchase helps support me and AI Art Weekly 🙏

And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:

  • Sharing it 🙏❤️
  • Following me on Twitter: @dreamingtulpa
  • Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday 😅)
  • Buying a physical art print to hang on your wall

Reply to this email if you have any feedback or ideas for this newsletter.

Thanks for reading and talk to you next week!

– dreamingtulpa

by @dreamingtulpa