AI Art Weekly #91
Hello there, my fellow dreamers, and welcome to issue #91 of AI Art Weekly!
We finally crossed the 4000 subscriber milestone this week! I suck at growing things, so this is a big deal for me :) My goal is to reach 10k by the end of the year, so one can still dream. Thank you to everybody who is with me on this journey!
In this issue:
- 3D: Animate3D, Shape of Motion, DreamCatalyst, StyleSplat, Click-Gaussian, 3DWire
- Motion: SMooDi, GuidedMotion
- Image: AccDiffusion, LogoSticker, Lite2Relight, Kinety
- Video: Noise Calibration, ST-AVSR, Streetscapes, VD3D, IDOL, TCAN
- and more!
Want me to keep up with AI for you? Well, that requires a lot of coffee. If you like what I do, please consider buying me a cup so I can stay awake and keep doing what I do.
Cover Challenge
For the next cover I'm looking for golden hour submissions! Reward is again $50 and a rare role in our Discord community which lets you vote in the finals. Rulebook can be found here and images can be submitted here.
News & Papers
3D
Animate3D: Animating Any 3D Model with Multi-view Video Diffusion
Animate3D can animate any static multi-view 3D model.
Shape of Motion: 4D Reconstruction from a Single Video
Shape of Motion can reconstruct dynamic 3D scenes from a single video. The method is able to capture the full 3D motion of a scene and can handle occlusions and disocclusions.
DreamCatalyst: Fast and High-Quality 3D Editing via Controlling Editability and Identity Preservation
DreamCatalyst can edit NeRF scenes in only about 25 minutes or produce high-quality results in less than 70 minutes.
StyleSplat: 3D Object Style Transfer with Gaussian Splatting
StyleSplat can stylize 3D objects in scenes represented by 3D Gaussians from reference style images. The method is able to localize style transfer to specific objects and supports stylization with multiple styles.
Click-Gaussian: Interactive Segmentation to Any 3D Gaussians
Click-Gaussian can interactively segment and manipulate 3D Gaussian Splats in a scene. Objects can be selected by pointing and clicking, and then easily be resized, moved, or repainted.
Generating 3D House Wireframes with Semantics
3DWire can generate 3D house wireframes from text! The wireframes can be easily segmented into distinct components, such as walls, roofs, and rooms, reflecting the semantic essence of the shape.
Motion
SMooDi: Stylized Motion Diffusion Model
SMooDi can generate stylized motion from text prompts and style motion sequences.
Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation
GuidedMotion is a text-to-motion technique that generates diverse global motions by using local actions as control signals. It can combine various local actions and adjust guiding weights, allowing for flexible user control over the generated motions.
Image
AccDiffusion: An Accurate Method for Higher-Resolution Image Generation
AccDiffusion can generate high-resolution images with less object repetition, something Stable Diffusion has been plagued by since its infancy.
LogoSticker: Inserting Logos into Diffusion Models for Customized Generation
LogoSticker can insert logos into diffusion models so they can be synthesized accurately and harmoniously in diverse contexts.
Lite2Relight: 3D-aware Single Image Portrait Relighting
Lite2Relight can relight human portraits while preserving 3D consistency and identity.
Kinetic Typography Diffusion Model
Kinety can generate kinetic typography videos with legible and artistic letter motions based on text prompts.
IMAGDressing-v1: Customizable Virtual Dressing
IMAGDressing-v1 can generate human try-on images from input garments. It is able to control different scenes through text and can be combined with IP-Adapter and ControlNet pose to enhance the diversity and controllability of generated images.
The Fabrication of Reality and Fantasy: Scene Generation with LLM-Assisted Prompt Interpretation
RFNet is a training-free approach that brings better prompt understanding to image generation. It adds support for prompt reasoning, conceptual and metaphorical thinking, imaginative scenarios and more.
Video
Noise Calibration: Plug-and-play Content-Preserving Video Enhancement using Pre-trained Video Diffusion Models
Similar to creative upscaling in images, Noise Calibration can improve the visual quality of videos while maintaining the structure of the input.
Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors
ST-AVSR is a new video super-resolution method. It is able to upscale videos to arbitrary scales while preserving spatial detail and temporal consistency.
Streetscapes: Large-scale Consistent Street View Generation Using Autoregressive Video Diffusion
Streetscapes can generate long sequences of views through an on-the-fly synthesized city-scale scene, controlled by layout maps and text.
VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control
VD3D enables camera control for video diffusion models and can transfer the camera trajectory from a reference video.
IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation
IDOL is yet another method to animate characters using an input image and pose guidance video. This one lets you put subjects into different backgrounds.
TCAN: Animating Human Images with Temporally Consistent Pose Guidance using Diffusion Models
TCAN can animate characters of various styles from a pose guidance video.
Audio
Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity
MaskVAT can generate high-quality audio tracks for videos. The model is able to synchronize sound onsets with visual actions.
Also interesting
What if Windows XP had AI capabilities? @sawyerhood built Windows9x, a fictional OS which can generate any retro application you can think of.
RunwayML's Gen-3 output is so coherent, it works incredibly well with current object tracking capabilities. This week Graeme Shepherd did some tests using Cinema4D, and the results look fantastic.
It's never been easier to turn yourself into a video game character! @toolstelegraph did it in under 2 minutes using Dzine AI and Tripo AI and shared a tutorial.
And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:
- Sharing it
- Following me on Twitter: @dreamingtulpa
- Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday)
- Buying a physical art print to hang on your wall
Reply to this email if you have any feedback or ideas for this newsletter.
Thanks for reading and talk to you next week!
-- dreamingtulpa