AI Art Weekly #69

Hello there, my fellow dreamers, and welcome to issue #69 of AI Art Weekly! 👋

I’ve been busy working on my new AI-powered product this week and can finally reveal… its name 😅 It’s called Shortie and it’s a tool that will help you create short-form videos with AI. But before I reveal more, let’s dive into this week’s Generative AI art news and papers. The highlights of this week are:

  • Midjourney’s Niji v6 and Style References
  • AnimateLCM for real-time video generation
  • Motion-I2V Motion Brush for controllable image-to-video
  • VR-GS for interactive 3D Gaussian splats in VR
  • Gaussian Splashing for dynamic fluid synthesis
  • AToM for text-to-mesh
  • Media2Face for co-speech facial animations
  • Anything in Any Scene for photorealistic video object insertion
  • SEELE for repositioning subjects within an image
  • StableIdentity for inserting anybody into any scene
  • and more!

Cover Challenge 🎨

Theme: spicy
118 submissions by 77 artists
🏆 1st: @ManoelKhan
🥈 2nd: @AIstronaut42
🥉 3rd: @brockwebb
🥉 3rd: @John_Synthetic

News & Papers

Midjourney: Niji v6 and Style References

Midjourney released the new Niji v6 model this week. It’s a model specifically tuned for Eastern and anime aesthetics. But the more exciting feature is the new Style References. The new --sref <urlA> option lets you guide the model with reference images to create images with a more consistent style. I’m already addicted again.
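
In practice, a prompt with Style References looks something along these lines (the URLs are just placeholders for your own reference images, and you can pass more than one separated by spaces):

  /imagine prompt: a koi pond at dusk, lantern light --niji 6 --sref <urlA> <urlB>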

Image created by me with multiple Style References (#1 and #2)

AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning

Last year we got real-time diffusion for images; this year we’ll get it for video! AnimateLCM can generate high-fidelity videos in a minimal number of steps. The model also supports image-to-video as well as adapters like ControlNet. It’s not available yet, but once it hits, expect way more AI-generated video content.

AnimateLCM example generated in 4 steps

Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling

Being able to iterate fast is one important step for generative AI; another is controllability. Motion-I2V’s framework not only seems to surpass commercial solutions like Pika and RunwayML in image-to-video tasks, but also offers features like Motion Brush, Motion Drag, and video-to-video, with incredible results. The only downside to this? There is no code 😭

Motion-I2V example

VR-GS: A Physical Dynamics-Aware Interactive Gaussian Splatting System in Virtual Reality

With Apple’s Vision Pro being released today, creating 3D content becomes more important by the day (if the device finds adoption, that is). But 3D/VR is hard. Luckily for us, there is AI.

VR-GS allows users to interact with 3D Gaussian kernels in VR and can generate realistic dynamic responses and illumination in real-time, making it possible to manipulate objects and scenes with physically plausible results.

VR-GS in action

Gaussian Splashing: Dynamic Fluid Synthesis with Gaussian Splatting

Interaction is one thing, but what about liquids? Gaussian Splashing combines position-based dynamics with 3DGS, making it possible to simulate how dynamic fluids and solids physically interact with Gaussian Splats.

Gaussian Splashing example

AToM: Amortized Text-to-Mesh using 2D Diffusion

Gaussian Splats are one option, but what about good old 3D meshes? Well, AToM is a new text-to-mesh framework that can generate high-quality textured 3D meshes from text prompts in less than a second. The method is optimized across multiple prompts at once and can create diverse objects it wasn’t trained on.

AToM examples

Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance

Media2Face can generate 3D facial animations from speech and accepts guidance from audio, text, and image prompts. The model can also control the expression in each frame with either a reference image or a text prompt. Wow.

Media2Face example

Anything in Any Scene: Photorealistic Video Object Insertion

Anything in Any Scene is a method that can insert objects into videos while maintaining the same level of photorealism as the original footage. The model is able to handle occlusions and lighting conditions and can even generate shadows for the inserted objects 🤯

AIAS example, can you spot the inserted object?

SEELE: Repositioning The Subject Within Image

SEELE can move a subject around within an image. It does so by removing the subject, inpainting occluded portions, and harmonizing the appearance of the repositioned subject with the surrounding areas.

SEELE example

StableIdentity: Inserting Anybody into Anywhere at First Sight

StableIdentity is yet another method that can generate diverse, customized images in various contexts from a single input image. The cool thing about this method is that it can combine the learned identity with ControlNet and even inject it into video (ModelScope) and 3D (LucidDreamer) generation.

StableIdentity examples

Also interesting


Interview

This week we’re talking to machine learning VJ Vadim Epstein.


Tools & Tutorials

These are some of the most interesting resources I’ve come across this week.

“Symbiosis” by me, available on objkt

And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:

  • Sharing it 🙏❤️
  • Following me on Twitter: @dreamingtulpa
  • Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday 😅)
  • Buying a physical art print to hang on your wall

Reply to this email if you have any feedback or ideas for this newsletter.

Thanks for reading and talk to you next week!

– dreamingtulpa

by @dreamingtulpa