AI Art Weekly #91

Hello there, my fellow dreamers, and welcome to issue #91 of AI Art Weekly! 👋

We finally crossed the 4000 subscriber milestone this week! I suck at growing things, so this is a big deal for me :) 🎉🎉🎉 My goal is to reach 10k by the end of the year, so one can still dream 😅 Thank you to everybody who is with me on this journey 🙏❤️

In this issue:

  • 3D: Animate3D, Shape of Motion, DreamCatalyst, StyleSplat, Click-Gaussian, 3DWire
  • Motion: SMooDi, GuidedMotion
  • Image: AccDiffusion, LogoSticker, Lite2Relight, Kinety
  • Video: Noise Calibration, ST-AVSR, Streetscapes, VD3D, IDOL, TCAN
  • and more!

Cover Challenge 🎨

Theme: 404 - not found
46 submissions by 28 artists
AI Art Weekly Cover Art Challenge 404 - not found submission by EternalSunrise7
🏆 1st: @EternalSunrise7
AI Art Weekly Cover Art Challenge 404 - not found submission by beholdthe84
🥈 2nd: @beholdthe84
AI Art Weekly Cover Art Challenge 404 - not found submission by WhiteSolitude22
🥉 3rd: @WhiteSolitude22
AI Art Weekly Cover Art Challenge 404 - not found submission by CurlyP139
🧑 4th: @CurlyP139

News & Papers

3D

Animate3D: Animating Any 3D Model with Multi-view Video Diffusion

Animate3D can animate any static multi-view 3D model.

Animate3D examples

Shape of Motion: 4D Reconstruction from a Single Video

Shape of Motion can reconstruct dynamic 3D (4D) scenes from a single video. The method is able to capture the full 3D motion of a scene and can handle occlusions and disocclusions.

Shape of Motion example

DreamCatalyst: Fast and High-Quality 3D Editing via Controlling Editability and Identity Preservation

DreamCatalyst can edit NeRF scenes in only about 25 minutes or produce high-quality results in less than 70 minutes.

DreamCatalyst example

StyleSplat: 3D Object Style Transfer with Gaussian Splatting

StyleSplat can stylize 3D objects in scenes represented by 3D Gaussians from reference style images. The method is able to localize style transfer to specific objects and supports stylization with multiple styles.

StyleSplat example

Click-Gaussian: Interactive Segmentation to Any 3D Gaussians

Click-Gaussian can interactively segment and manipulate 3D Gaussian Splats in a scene. Objects can be selected by pointing and clicking, and then easily resized, moved, or repainted.

Click-Gaussian example

Generating 3D House Wireframes with Semantics

3DWire can generate 3D house wireframes from text! The wireframes can be easily segmented into distinct components, such as walls, roofs, and rooms, reflecting the semantic essence of the shape.

3DWire example

Motion

SMooDi: Stylized Motion Diffusion Model

SMooDi can generate stylized motion from text prompts and style motion sequences.

SMooDi examples

Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation

GuidedMotion is a text-to-motion technique that generates diverse global motions by using local actions as control signals. It can combine various local actions and adjust guiding weights, allowing for flexible user control over the generated motions.

GuidedMotion examples

Image

AccDiffusion: An Accurate Method for Higher-Resolution Image Generation

AccDiffusion can generate high-resolution images with less object repetition, something Stable Diffusion has been plagued by since its infancy!

AccDiffusion examples

LogoSticker: Inserting Logos into Diffusion Models for Customized Generation

LogoSticker can insert logos into diffusion models and enable their seamless synthesis in varied contexts. The method is able to generate the logos accurately and harmoniously with their surroundings.

LogoSticker examples

Lite2Relight: 3D-aware Single Image Portrait Relighting

Lite2Relight can relight human portraits while preserving 3D consistency and identity.

Lite2Relight examples

Kinetic Typography Diffusion Model

Kinety can generate kinetic typography videos with legible and artistic letter motions based on text prompts.

Kinety example

IMAGDressing-v1: Customizable Virtual Dressing

IMAGDressing-v1 can generate human try-on images from input garments. It is able to control different scenes through text and can be combined with IP-Adapter and ControlNet pose to enhance the diversity and controllability of generated images.

IMAGDressing-v1 examples

The Fabrication of Reality and Fantasy: Scene Generation with LLM-Assisted Prompt Interpretation

RFNet is a training-free approach that brings better prompt understanding to image generation, adding support for prompt reasoning, conceptual and metaphorical thinking, imaginative scenarios and more.

RFNet examples

Video

Noise Calibration: Plug-and-play Content-Preserving Video Enhancement using Pre-trained Video Diffusion Models

Similar to creative upscaling in images, Noise Calibration can improve the visual quality of videos while maintaining the structure of the input.

Noise Calibration example

Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors

ST-AVSR is a new video super-resolution method. It is able to upscale videos to arbitrary scales while preserving spatial detail and temporal consistency.

ST-AVSR comparison

Streetscapes: Large-scale Consistent Street View Generation Using Autoregressive Video Diffusion

Streetscapes can generate long sequences of views through an on-the-fly synthesized city-scale scene, controlled by layout maps and text.

Streetscapes examples

VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control

VD3D enables camera control for video diffusion models and can transfer the camera trajectory from a reference video.

VD3D examples

IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation

IDOL is yet another method to animate characters using an input image and pose guidance video. This one lets you put subjects into different backgrounds.

IDOL examples

TCAN: Animating Human Images with Temporally Consistent Pose Guidance using Diffusion Models

TCAN can animate characters of various styles from a pose guidance video.

TCAN examples

Audio

Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity

MaskVAT can generate high-quality audio tracks for videos. The model is able to synchronize sound onsets with visual actions.

Check out the project page for MaskVAT examples with sound

Also interesting

“Tomorrow we Float” by me.

And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:

  • Sharing it 🙏❤️
  • Following me on Twitter: @dreamingtulpa
  • Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday 😅)
  • Buying a physical art print to hang on your wall

Reply to this email if you have any feedback or ideas for this newsletter.

Thanks for reading and talk to you next week!

– dreamingtulpa

by @dreamingtulpa