AI Art Weekly #133
Hello, my fellow dreamers, and welcome to issue #133 of AI Art Weekly!
It's been a while. I needed a bit of time off as I've been slowly sliding into burnout (oopsie). A lot has happened in the last month: not only am I now officially on Spotify, but there have also been a ton of interesting new releases, which I will again try to bring to you in a weekly format.
So enjoy the read and until next week.
Support the newsletter and unlock the full potential of AI-generated art with my curated collection of 375+ high-quality Midjourney SREF codes and 4000+ creative prompts.
News & Papers
Highlights
- ByteDance released Seedream 4.0, the latest state-of-the-art image model for generation and editing.
- Luma released Ray3, a reasoning video model that supports studio-grade HDR. It's quite expensive though.
- World Labs released an image-to-3D-world model. It takes an image and generates a fully navigable and persistent Gaussian Splat world (with a minimap).
- Tencent released SRPO, an RL framework that can improve the style capabilities of text-to-image models like FLUX.1-dev.
- EbSynth is back with V2. This tool lets you edit a single frame in a video and propagate the changes across nearby frames.
3D
SemLayoutDiff: Semantic Layout Generation with Diffusion Model for Indoor Scene Synthesis
SemLayoutDiff can generate diverse 3D indoor scenes by creating detailed semantic maps and placing furniture while considering doors and windows.

SemLayoutDiff example
LongSplat: Robust Unposed 3D Gaussian Splatting for Casual Long Videos
LongSplat can create high-quality 3D scenes from long videos without needing camera positions.

LongSplat example
StyleSculptor: Zero-Shot Style-Controllable 3D Asset Generation with Texture-Geometry Dual Guidance
StyleSculptor can generate 3D assets from a content image and style images without needing extra training.

StyleSculptor example
GeoSAM2: Unleashing the Power of SAM2 for 3D Part Segmentation
GeoSAM2 can segment 3D meshes into parts using 2D prompts.

GeoSAM2 example
Image
OmniTry: Virtual Try-On Anything without Masks
OmniTry lets users virtually try on jewelry and accessories without needing a mask.

OmniTry example
Chord: Chain of Rendering Decomposition for PBR Material Estimation from Generated Texture Images
Chord can generate high-quality PBR materials from texture images. It uses a fine-tuned SDXL for texture creation and allows users to edit materials flexibly, performing well on both generated and real-world images.

Chord example
OmniStyle2: Scalable and High Quality Artistic Style Transfer Data Generation via Destylization
OmniStyle2 can generate high-quality artistic style transfer data by removing stylistic elements from artworks to recover natural content.

OmniStyle2 example
Video
HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning
HuMo can generate high-quality human-centric videos from text, images, and audio. It ensures that the subjects are preserved and the audio matches the visuals, using advanced training methods for better control.

HuMo example
Vivid-VR: Distilling Concepts from Text-to-Video Diffusion Transformer for Photorealistic Video Restoration
Vivid-VR can restore and enhance videos using a text-to-video diffusion transformer. It achieves realistic textures and smooth motion while preserving content and giving users control over the video generation process.

Vivid-VR example
Lumen: Consistent Video Relighting and Harmonious Background Replacement with Video Generative Models
Lumen can replace video backgrounds while adjusting the lighting of the foreground for a consistent look.

Lumen example
InfinityHuman: Towards Long-Term Audio-Driven Human
InfinityHuman can generate high-resolution, long-duration videos of human animations driven by audio. It improves lip synchronization and keeps the person's appearance consistent.

InfinityHuman example
OmniHuman-1.5: Instilling an Active Mind in Avatars via Cognitive Simulation
OmniHuman-1.5 can generate expressive character animations from a single image and audio track. It captures emotions and intent accurately, achieving high lip-sync accuracy and natural motion quality.

OmniHuman-1.5 example
Audio
DreamAudio: Customized Text-to-Audio Generation with Diffusion Models
DreamAudio can generate customized audio samples from user concepts and text prompts. It allows precise control over sound features and shows high consistency with the input text.

DreamAudio spectrogram example

conspiracy against the human race --chaos 100 --ar 4:3 --exp 100 --raw --sref 3249290888 1936742898 225813348 4164996227 416325198 --stylize 1000 --weird 1000
And that, my fellow dreamers, concludes yet another AI Art Weekly issue. If you like what I do, you can support me by:
- Sharing it
- Following me on Twitter: @dreamingtulpa
- Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday)
- Buying my Midjourney prompt collection on PROMPTCACHE
- Buying access to AI Art Weekly Premium
Reply to this email if you have any feedback or ideas for this newsletter.
Thanks for reading and talk to you next week!
– dreamingtulpa