AI Art Weekly #125

Hello, my fellow dreamers, and welcome to issue #125 of AI Art Weekly! 👋

A pretty wild week in AI is behind us: OpenAI launched Codex, a new agentic coding tool. Google became the new Google by practically obliterating dozens of startups and workflows overnight with their new AI models and integrations. Sama and Jony are getting married and are having a baby called io. Anthropic released Claude 4 (and it’s a little bit Stasi).

All in all, it sounds like just another Monday. But the quality is getting so good that people are starting to lose touch with what is real and what is not. Google’s new video model Veo 3 can now produce audio and speech, and with results like this, how could you not question the nature of your own reality?

It doesn’t take a genius to see how these models will become so good at simulating reality that there won’t be a need for other custom models and pipelines anymore. Which is kind of the point of this newsletter. But we aren’t quite there yet, so you’ll still hear from me until I pivot to mindful one-liners.

Until then, enjoy touching grass this weekend ✌️

P.S.: Added group style references support to Promptcache. First style is up, more to come 🙌


News & Papers

Highlights

Veo 3

I already wrote about it in the intro: Google shipped new state-of-the-art video and image models called Veo 3 and Imagen 4 this week.

Veo 3 especially is kind of breaking the internet and the human psyche with how good its output is. It can generate high-quality videos up to 1080p with native audio, including dialogue, sound effects, and ambient noise. The lip-sync accuracy and scene consistency for cinematic outputs are unlike anything we have seen before. Describing it doesn’t do it justice; you need to see it for yourself.

Prompt: The scene explodes with the raw, visceral, and unpredictable energy of a hardcore off-road rally, captured with a dynamic, almost found-footage or embedded sports documentary aesthetic. [...]

Imagen 4

Google also released Imagen 4, which comes with better photorealism, finer details, and advanced text capabilities while supporting a wide range of art styles. Art styles especially are what have been keeping Midjourney at the top so far. It will be interesting to see how Imagen 4 can compete.

The Quest for the Cheesy Treasure (Retro Comic Style): Drawn with slightly pulpy art, Ben Day dots for shading, dramatic close-ups, and lurid colors (orange, deep purples, oranges). [...]

3D

MVPainter: Accurate and Detailed 3D Texture Generation via Multi-View Diffusion with Geometric Control

MVPainter can generate high-quality 3D textures by aligning reference textures with geometry.

MVPainter examples

EVA: Expressive Virtual Avatars from Multi-view Videos

EVA can generate lifelike human avatars in real time from multi-view videos. It allows for independent control of facial expressions, body movements, and hand gestures, making it ideal for virtual reality, gaming, and remote communication.

EVA example

GA3CE: Unconstrained 3D Gaze Estimation with Gaze-Aware 3D Context Encoding

GA3CE can estimate 3D gaze direction in real-world settings. It’s a stretch, but the example made me think that one day soon we might be able to generate POV footage from a person inside a regular video 🤯

GA3CE example

Image

Custom SVG: Style Customization of Text-to-Vector Generation with Image Diffusion Priors

Custom SVG can generate high-quality SVGs from text prompts with customizable styles.

Custom SVG examples

BAGEL: Emerging Properties in Unified Multimodal Pretraining

BAGEL is a unified multimodal model that can understand and generate images and text, excelling in tasks like image editing and predicting future frames. Basically the open-source version of GPT-4o.

BAGEL examples

Video

MTVCrafter: 4D Motion Tokenization for Open-World Human Image Animation

MTVCrafter can generate high-quality human image animations from 3D motion sequences.

MTVCrafter examples

ISA4D: Interspatial Attention for Efficient 4D Human Video Generation

ISA4D can generate high-quality 4D human videos with control over camera angles and body poses.

ISA4D example

Enjoy the weekend my fellow dreamers! The Murkloom style is now on Promptcache.

And that, my fellow dreamers, concludes yet another AI Art Weekly issue. If you like what I do, you can support me by:

  • Sharing it 🙏❤️
  • Following me on Twitter: @dreamingtulpa
  • Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday 😅)
  • Buying my Midjourney prompt collection on PROMPTCACHE 🚀
  • Buying access to AI Art Weekly Premium 👑

Reply to this email if you have any feedback or ideas for this newsletter.

Thanks for reading and talk to you next week!

– dreamingtulpa
