AI Art Weekly #133

Hello, my fellow dreamers, and welcome to issue #133 of AI Art Weekly! πŸ‘‹

It’s been a while. I needed a bit of time off as I was slowly sliding into burnout (oopsie πŸ™ˆ). A lot has happened in the past month: not only am I now officially on Spotify, but there have also been a ton of interesting new releases, which I’ll try to bring to you again in a weekly format 🀞

So enjoy the read and until next week.


News & Papers

Highlights

  • ByteDance released Seedream 4.0, its latest state-of-the-art model for image generation and editing.
  • Luma released Ray3, a reasoning video model with support for studio-grade HDR, though it’s quite expensive.
  • World Labs released an image-to-3D-world model. It takes an image and generates a fully navigable and persistent Gaussian Splat world (with a minimap).
  • Tencent released SRPO, an RL framework that can improve the style capabilities of text-to-image models like FLUX.1-dev.
  • EbSynth is back with V2. This tool lets you edit a single frame of a video and propagate the changes across nearby frames.

3D

SemLayoutDiff: Semantic Layout Generation with Diffusion Model for Indoor Scene Synthesis

SemLayoutDiff can generate diverse 3D indoor scenes by creating detailed semantic maps and placing furniture while considering doors and windows.

SemLayoutDiff example

LongSplat: Robust Unposed 3D Gaussian Splatting for Casual Long Videos

LongSplat can create high-quality 3D scenes from long casual videos without requiring known camera poses.

LongSplat example

StyleSculptor: Zero-Shot Style-Controllable 3D Asset Generation with Texture-Geometry Dual Guidance

StyleSculptor can generate 3D assets from a content image and style images without needing extra training.

StyleSculptor example

GeoSAM2: Unleashing the Power of SAM2 for 3D Part Segmentation

GeoSAM2 can segment 3D meshes into parts using 2D prompts.

GeoSAM2 example

Image

OmniTry: Virtual Try-On Anything without Masks

OmniTry lets users virtually try on jewelry and accessories without needing a mask.

OmniTry example

Chord: Chain of Rendering Decomposition for PBR Material Estimation from Generated Texture Images

Chord can generate high-quality PBR materials from texture images. It uses a fine-tuned SDXL for texture creation and allows users to edit materials flexibly, performing well on both generated and real-world images.

Chord example

OmniStyle2: Scalable and High Quality Artistic Style Transfer Data Generation via Destylization

OmniStyle2 can generate high-quality artistic style transfer data by removing stylistic elements from artworks to recover natural content.

OmniStyle2 example

Video

HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning

HuMo can generate high-quality human-centric videos from text, images, and audio. It ensures that the subjects are preserved and the audio matches the visuals, using advanced training methods for better control.

HuMo example

Vivid-VR: Distilling Concepts from Text-to-Video Diffusion Transformer for Photorealistic Video Restoration

Vivid-VR can restore and enhance videos using a text-to-video diffusion transformer. It achieves realistic textures and smooth motion while preserving content and giving users control over the video generation process.

Vivid-VR example

Lumen: Consistent Video Relighting and Harmonious Background Replacement with Video Generative Models

Lumen can replace video backgrounds while adjusting the lighting of the foreground for a consistent look.

Lumen example

InfinityHuman: Towards Long-Term Audio-Driven Human Animation

InfinityHuman can generate high-resolution, long-duration videos of human animations driven by audio. It improves lip synchronization and keeps the person’s appearance consistent.

InfinityHuman example

OmniHuman-1.5: Instilling an Active Mind in Avatars via Cognitive Simulation

OmniHuman-1.5 can generate expressive character animations from a single image and audio track. It captures emotions and intent accurately, achieving high lip-sync accuracy and natural motion quality.

OmniHuman-1.5 example

Audio

DreamAudio: Customized Text-to-Audio Generation with Diffusion Models

DreamAudio can generate customized audio samples from user concepts and text prompts. It allows precise control over sound features and shows high consistency with the input text.

DreamAudio spectrogram example

conspiracy against the human race --chaos 100 --ar 4:3 --exp 100 --raw --sref 3249290888 1936742898 225813348 4164996227 416325198 --stylize 1000 --weird 1000

And that, my fellow dreamers, concludes yet another AI Art Weekly issue. If you like what I do, you can support me by:

  • Sharing it πŸ™β€οΈ
  • Following me on Twitter: @dreamingtulpa
  • Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday πŸ˜…)
  • Buying my Midjourney prompt collection on PROMPTCACHE πŸš€
  • Buying access to AI Art Weekly Premium πŸ‘‘

Reply to this email if you have any feedback or ideas for this newsletter.

Thanks for reading and talk to you next week!

– dreamingtulpa