AI Art Weekly #125
Hello, my fellow dreamers, and welcome to issue #125 of AI Art Weekly! 👋
A pretty wild week in AI is behind us: OpenAI launched Codex, a new agentic coding tool. Google became the new Google by practically obliterating dozens of startups and workflows overnight with their new AI models and integrations. Sama and Jony are getting married and are having a baby called io. Anthropic released Claude 4 (and it’s a little bit Stasi).
All in all, sounds like just another Monday. But the quality is getting so good that people are starting to lose touch with what is real and what is not. Google’s new video model Veo 3 can now produce audio and speech, and with results like this, how could you not question the nature of your own reality?
It doesn’t take a genius to see how these models will become so good at simulating reality that there won’t be a need for other custom models and pipelines anymore. Which is kind of the point of this newsletter. But we aren’t quite there yet, so you’ll still hear from me until I pivot to mindful one-liners.
Until then, enjoy touching grass this weekend ✌️
P.S.: Added group style references support to Promptcache. First style is up, more to come 🙌
Support the newsletter and unlock the full potential of AI-generated art with my curated collection of 275+ high-quality Midjourney SREF codes and 2000+ creative prompts.
News & Papers
Highlights
Veo 3
I already wrote about it in the intro: Google shipped new state-of-the-art video and image models called Veo 3 and Imagen 4 this week.
Veo 3 in particular is kind of breaking the internet and the human psyche with how good its output is. It can generate high-quality videos up to 1080p with native audio, including dialogue, sound effects, and ambient noise. The lip-sync accuracy and scene consistency for cinematic outputs are unlike anything we have seen before. Describing it doesn’t do it justice; you need to see it for yourself.

Prompt: The scene explodes with the raw, visceral, and unpredictable energy of a hardcore off-road rally, captured with a dynamic, almost found-footage or embedded sports documentary aesthetic. [...]
Imagen 4
Google also released Imagen 4, which comes with better photorealism, finer details, and advanced text capabilities while supporting a wide range of art styles. Art styles especially are what has been keeping Midjourney at the top so far. It will be interesting to see how Imagen 4 can compete.

The Quest for the Cheesy Treasure (Retro Comic Style): Drawn with slightly pulpy art, Ben Day dots for shading, dramatic close-ups, and lurid colors (orange, deep purples, oranges). [...]
3D
MVPainter: Accurate and Detailed 3D Texture Generation via Multi-View Diffusion with Geometric Control
MVPainter can generate high-quality 3D textures by aligning reference textures with geometry.

MVPainter examples
EVA: Expressive Virtual Avatars from Multi-view Videos
EVA can generate lifelike human avatars in real time from multi-view videos. It allows for independent control of facial expressions, body movements, and hand gestures, making it ideal for virtual reality, gaming, and remote communication.

EVA example
GA3CE: Unconstrained 3D Gaze Estimation with Gaze-Aware 3D Context Encoding
GA3CE can estimate 3D gaze direction in real-world settings. It’s a stretch, but the example made me think that one day soon we might be able to generate POV footage from a person inside a regular video 🤯

GA3CE example
Image
Custom SVG: Style Customization of Text-to-Vector Generation with Image Diffusion Priors
Custom SVG can generate high-quality SVGs from text prompts with customizable styles.

Custom SVG examples
BAGEL: Emerging Properties in Unified Multimodal Pretraining
BAGEL is a unified multimodal model that can understand and generate images and text, excelling in tasks like image editing and predicting future frames. Basically the open-source version of GPT-4o.

BAGEL examples
Video
MTVCrafter: 4D Motion Tokenization for Open-World Human Image Animation
MTVCrafter can generate high-quality human image animations from 3D motion sequences.

MTVCrafter examples
ISA4D: Interspatial Attention for Efficient 4D Human Video Generation
ISA4D can generate high-quality 4D human videos with control over camera angles and body poses.

ISA4D example

Enjoy the weekend, my fellow dreamers! The Murkloom style is now on Promptcache.
And that, my fellow dreamers, concludes yet another AI Art Weekly issue. If you like what I do, you can support me by:
- Sharing it 🙏❤️
- Following me on Twitter: @dreamingtulpa
- Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday 😅)
- Buying my Midjourney prompt collection on PROMPTCACHE 🚀
- Buying access to AI Art Weekly Premium 👑
Reply to this email if you have any feedback or ideas for this newsletter.
Thanks for reading and talk to you next week!
– dreamingtulpa