AI Art Weekly #124

Hello, my fellow dreamers, and welcome to issue #124 of AI Art Weekly! 👋

My big client coding project is progressing nicely. The trick is mostly in context management and sticking to single-responsibility abstractions. If you’ve got that figured out, “coding” big projects with LLMs isn’t a challenge at all. At the end of the day, it’s just a matter of sticking to conventions. I might summarize my workflow and principles if there’s any interest in that.

I’ve also been working on a small update for Promptcache which will let me share SREF combinations that I’ve used for the past couple of covers. But it’s a bit delayed because I can’t split myself in two 😭

But enough of my rambling. AI progress isn’t halting, and as usual, I’ve summarized the latest developments for you. Enjoy your weekend!


News & Papers

Highlights

Stable Audio Open Small

Stable Audio Open Small is a 341-million-parameter text-to-audio model, optimized for Arm CPUs, enabling on-device audio generation. It can produce up to 11 seconds of stereo audio in under 8 seconds, which makes it ideal for sound effects or short musical riffs on smartphones and edge devices.

The weights are available on HuggingFace.

Stable Audio preview

3D

Generating Physically Stable and Buildable LEGO Designs from Text

LegoGPT can generate stable and buildable LEGO designs from text prompts. It uses physics-aware techniques to ensure designs are safe for manual assembly and robotic construction, and it can create colored and textured models.

a classical guitar

SVAD: From Single Image to 3D Avatar via Synthetic Data Generation with Video Diffusion and Data Augmentation

SVAD can generate high-quality 3D avatars from a single image. It keeps the person’s identity and details consistent across different poses and angles while allowing for real-time rendering.

SVAD example

SOAP: Style-Omniscient Animatable Portraits

SOAP can generate rigged 3D avatars from a single portrait image.

SOAP example

DiffLocks: Generating 3D Hair from a Single Image using Diffusion Models

DiffLocks can generate detailed 3D hair geometry from a single image in 3 seconds.

DiffLocks example

GlobalPose: Improving Global Motion Estimation in Sparse IMU-based Motion Capture with Physics

GlobalPose can capture human motion in 3D space using six IMUs (inertial measurement units). It accurately reconstructs global motions and local poses while estimating 3D contacts and forces.

GlobalPose example

WIR3D: Visually-Informed and Geometry-Aware 3D Shape Abstraction

WIR3D can abstract 3D shapes to enable easy shape changes.

WIR3D example

Image

Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis

Marigold can estimate depth, predict surface normals, and decompose images into their intrinsic components, all with minimal changes to the underlying pretrained diffusion model.

Marigold example

LightLab: Controlling Light Sources in Images with Diffusion Models

LightLab can change the intensity and color of light sources in images and adjust ambient lighting.

LightLab example

MonetGPT: Solving Puzzles Enhances MLLMs' Image Retouching Skills

MonetGPT can critique photos and suggest retouching edits. It explains each adjustment clearly, helps preserve the subject’s identity, and allows for personalized editing plans.

MonetGPT example

Enjoy the weekend, my fellow dreamers! This style is coming to Promptcache soon.

And that, my fellow dreamers, concludes yet another AI Art Weekly issue. If you like what I do, you can support me by:

  • Sharing it 🙏❤️
  • Following me on Twitter: @dreamingtulpa
  • Buying me a coffee (I could seriously use it; putting these issues together takes me 8-12 hours every Friday 😅)
  • Buying my Midjourney prompt collection on PROMPTCACHE 🚀
  • Buying access to AI Art Weekly Premium 👑

Reply to this email if you have any feedback or ideas for this newsletter.

Thanks for reading and talk to you next week!

– dreamingtulpa
