AI Art Weekly #126

Hello, my fellow dreamers, and welcome to issue #126 of AI Art Weekly! πŸ‘‹

Every week, when some major closed-source model is released, I think to myself that all other research is going to be obsolete from here on out. But there are always a few new gems here and there that I really like. So here we are again.

I’m drowning in work, so I’m gonna keep this intro short. Enjoy the weekend, everybody!


News & Papers

Highlights

FLUX.1 Kontext

Black Forest Labs released FLUX.1 Kontext, a suite of multimodal flow matching models that enable in-context image generation and editing.

The Kontext models enable:

  • Character consistency across multiple scenes and environments
  • Local editing for targeted modifications without affecting the rest of the image
  • Style reference preservation while generating new scenes
  • Interactive speed with up to 8x faster inference than current leading models
  • Iterative editing allowing step-by-step refinements while maintaining quality

The suite includes FLUX.1 Kontext [pro] and FLUX.1 Kontext [max], available via their Playground and partner platforms. FLUX.1 Kontext [dev] (12B parameters) will be open-sourced once it's out of private beta.

left: input image; middle: edit "tilt her head towards the camera"; right: edit "make her laugh"
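
If you want to script edits like the one above, here's a minimal sketch of what a Kontext [pro] request could look like. Fair warning: the endpoint path, header, and payload field names below are assumptions modeled on Black Forest Labs' asynchronous FLUX API pattern, so verify them against the official docs before relying on them.

# Minimal sketch of an image edit via FLUX.1 Kontext [pro].
# Endpoint, header, and field names are assumptions based on
# BFL's async FLUX API pattern -- check the official docs.
import base64
import os
import time

import requests

API_BASE = "https://api.bfl.ai"  # assumed base URL
API_KEY = os.environ["BFL_API_KEY"]

# Encode the input image as base64 so it can be sent inline.
with open("portrait.jpg", "rb") as f:
    input_image = base64.b64encode(f.read()).decode("utf-8")

# Submit the edit: the prompt describes the change, the input
# image provides the context the model should preserve.
response = requests.post(
    f"{API_BASE}/v1/flux-kontext-pro",  # assumed endpoint path
    headers={"x-key": API_KEY},         # assumed auth header
    json={
        "prompt": "tilt her head towards the camera",
        "input_image": input_image,
    },
)
response.raise_for_status()
task_id = response.json()["id"]  # assumed response shape

# The API is asynchronous: poll until the result is ready.
while True:
    result = requests.get(
        f"{API_BASE}/v1/get_result", params={"id": task_id}
    ).json()
    if result["status"] == "Ready":
        print(result["result"]["sample"])  # URL of the edited image
        break
    time.sleep(1)

Since the [dev] weights aren't public yet, the API (or the Playground) is the only way to try Kontext for now.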

3D

Triangle Splatting for Real-Time Radiance Field Rendering

After NeRFs and Gaussian Splatting, we now have Triangle Splatting: a new method that renders radiance fields in real time at over 2,400 FPS at 1280x720 resolution. It combines triangle representations with differentiable rendering for better visual quality and faster results than Gaussian splatting methods.

Triangle Splatting example

UniTEX: Universal High Fidelity Generative Texturing for 3D Shapes

UniTEX can generate high-quality textures for 3D assets without using UV mapping. It maps 3D points to texture values based on surface proximity and uses a transformer-based model for better texture quality.

UniTEX example

Direct3D-S2: Gigascale 3D Generation Made Easy with Spatial Sparse Attention

Direct3D-S2 can generate detailed, high-resolution 3D shapes at gigascale, using spatial sparse attention to keep generation computationally tractable.

Direct3D-S2 examples

RenderFormer: Transformer-based Neural Rendering of Triangle Meshes with Global Illumination

RenderFormer can render images from triangle mesh representations with full global illumination effects.

RenderFormer example

WeatherEdit: Controllable Weather Editing with 4D Gaussian Field

WeatherEdit can generate realistic weather effects in 3D scenes with control over type and severity. It uses a dynamic 4D Gaussian field for weather particles and ensures consistency across images, making it ideal for simulations like autonomous driving in bad weather.

WeatherEdit example

Weather-Magician: Reconstruction and Rendering Framework for 4D Weather Synthesis In Real Time

Weather-Magician can add realistic 4D weather effects like haze, fog, rain, and snow to real scenes.

Weather-Magician example

Image

OmniPainter: Training-free Stylized Text-to-Image Generation with Fast Inference

OmniPainter can generate high-quality images that match a prompt and a style reference image in just 4 to 6 timesteps. It uses the self-consistency property of latent consistency models to ensure the results closely align with the style of the reference image.

OmniPainter examples

LayerPeeler: Autoregressive Peeling for Layer-wise Image Vectorization

LayerPeeler can vectorize images layer by layer, autoregressively peeling off the topmost layers to recover the occluded content underneath and producing vector graphics with clean paths and organized layers.

LayerPeeler examples

TF-SA: Training-Free Stylized Abstraction

Training-Free Stylized Abstraction can generate stylized abstractions from a single image without fine-tuning. It preserves identity-defining features while allowing for artistic changes, and achieves high-quality results across various styles.

TF-SA examples

ObjectClear: Complete Object Removal via Object-Effect Attention

ObjectClear can remove objects from images while also getting rid of shadows and reflections. It uses an object-effect attention mechanism to improve how well it separates the foreground from the background, making it better than other methods in tricky situations.

ObjectClear example

SAIL: Self-supervised Albedo Estimation from Real Images with a Latent Diffusion Model

SAIL can extract high-quality albedo maps from single images. It allows for realistic lighting changes and scene editing without needing labeled data.

SAIL examples

Video

MiniMax-Remover: Taming Bad Noise Helps Video Object Removal

MiniMax-Remover can remove objects from videos efficiently with just 6 sampling steps.

MiniMax-Remover example

EPiC: Efficient Video Camera Control Learning with Precise Anchor-Video Guidance

EPiC can add precise camera control to image-to-video and video-to-video generation without needing dense camera trajectory annotations, using anchor videos for guidance instead.

EPiC examples

DualParal: Minute-Long Videos with Dual Parallelisms

DualParal can generate minute-long videos by splitting the workload across multiple GPUs, parallelizing over both temporal frames and model layers.

DualParal example

TalkingMachines: Real-Time Audio-Driven FaceTime-Style Video via Autoregressive Diffusion Models

TalkingMachines can turn a video model into a real-time, audio-driven avatar animator for FaceTime-style chats. It allows users to switch between speaking and listening modes, enabling endless conversations with popular audio large language models.

TalkingMachines example

Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation

MultiTalk can generate videos of multiple people talking by using audio from different sources, a reference image, and a prompt.

Let Them Talk example

DiffPhy: Think Before You Diffuse: LLMs-Guided Physics-Aware Video Generation

DiffPhy can generate realistic videos that follow the laws of physics by using an LLM to reason about the physical content of a prompt before guiding the diffusion process.

DiffPhy example

OmniSync: Towards Universal Lip Synchronization via Diffusion Transformers

OmniSync can achieve universal lip synchronization in videos without needing reference frames or masks.

OmniSync examples

Any-to-Bokeh: One-Step Video Bokeh via Multi-Plane Image Guided Diffusion

Any-to-Bokeh can add depth-aware bokeh effects to any video in a single step, using multi-plane image guidance to control focus and blur.

Any-to-Bokeh example

Facial Motion Timeline Control: Exploring Timeline Control for Facial Motion Generation

Facial Motion Timeline Control can generate natural facial motions with precise timing using a multi-track timeline control system.

Facial Motion Timeline Control example

Enjoy the weekend, my fellow dreamers! The Chromoglint style is now on Promptcache.

And that, my fellow dreamers, concludes yet another AI Art Weekly issue. If you like what I do, you can support me by:

  • Sharing it πŸ™β€οΈ
  • Following me on Twitter: @dreamingtulpa
  • Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday πŸ˜…)
  • Buying my Midjourney prompt collection on PROMPTCACHE πŸš€
  • Buying access to AI Art Weekly Premium πŸ‘‘

Reply to this email if you have any feedback or ideas for this newsletter.

Thanks for reading and talk to you next week!

– dreamingtulpa
