AI Art Weekly #98

Hello there, my fellow dreamers, and welcome to issue #98 of AI Art Weekly! 👋

As I mentioned last week, I’m working on a solution to keep track of no-code papers. As the first part of that project, I’m releasing the new AI Toolbox directory. It’s a curated list of all papers with code that I’ve featured in this newsletter, which I’ll be updating weekly. More soon on the no-code part of the project!

I also found time to add more Midjourney styles to PROMPTCACHE, and the library is now at 100+ styles! More to come 🚀

I’ll be away next week, so the next issue will be in two weeks. I hope you enjoy this one!


Cover Challenge 🎨

Theme: inversion
39 submissions by 26 artists
🏆 1st: @CadizFFM
🥈 2nd: @onchainsherpa
🥈 2nd: @daidatep
🥉 3rd: @risugawa

News & Papers

3D

MaskedMimic: Unified Physics-Based Character Control Through Masked Motion Inpainting

MaskedMimic can generate diverse motions for interactive characters using a physics-based controller. It supports various inputs like keyframes and text, allowing for smooth transitions and adaptation to complex environments.

MaskedMimic example

WiLoR: End-to-end 3D Hand Localization and Reconstruction in-the-wild

WiLoR can localize and reconstruct multiple hands in real-time from single images. It achieves smooth 3D hand tracking with high accuracy, using a large dataset of over 2 million hand images.

WiLoR example

3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion

3DTopia-XL can generate high-quality 3D PBR assets from text or image inputs in just 5 seconds.

3DTopia-XL examples

Architectural Co-LOD Generation

Architectural Co-LOD Generation can manage the level of detail (LOD) in architectural models by standardizing shapes across buildings. This method ensures high-quality details and consistency in both single models and collections.

Co-LOD example

DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion

DreamWaltz-G can generate high-quality 3D avatars from text and animate them using SMPL-X motion sequences. It improves avatar consistency with Skeleton-guided Score Distillation and is useful for human video reenactment and creating scenes with multiple subjects.

DreamWaltz-G examples

UniHair: Towards Unified 3D Hair Reconstruction from Single-View Portraits

UniHair can create 3D hair models from single-view portraits, handling both braided and unbraided styles. It uses a large dataset and advanced techniques to accurately capture complex hairstyles and generalize well to real images.

UniHair examples

FlexiTex: Enhancing Texture Generation with Visual Guidance

FlexiTex can generate high-quality textures for 3D models by using visual guidance to add detail and consistency. It has features that preserve fine details and improve how textures look from different camera angles.

FlexiTex examples

Image

Prompt Sliders: Sliders for Fine-Grained Control, Editing and Erasing of Concepts in Diffusion Models

Prompt Sliders can control and edit concepts in diffusion models. It lets users adjust the strength of a concept with a learned text embedding of just 3KB, making it far lighter and faster than traditional LoRA-based approaches.

Prompt Sliders examples
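To give an intuition for the approach: the core idea is that a concept lives in a small learned embedding which gets scaled by a slider weight and added to the prompt embeddings, rather than stored as LoRA weight deltas. This is a minimal sketch of that idea in NumPy, not the paper’s exact formulation; the function name and shapes are my own illustration.

```python
import numpy as np

def apply_prompt_slider(prompt_emb, concept_emb, alpha):
    """Blend a learned concept embedding into the prompt embeddings.

    alpha controls concept strength: 0 leaves the prompt untouched,
    positive values strengthen the concept, negative values erase it.
    """
    return prompt_emb + alpha * concept_emb

# Toy example: 4 tokens with 768-dim embeddings (CLIP-sized).
rng = np.random.default_rng(0)
prompt = rng.standard_normal((4, 768)).astype(np.float32)
concept = rng.standard_normal((4, 768)).astype(np.float32)

weak = apply_prompt_slider(prompt, concept, 0.2)
strong = apply_prompt_slider(prompt, concept, 1.0)
erased = apply_prompt_slider(prompt, concept, -1.0)
```

Note that a single 768-dim fp32 token embedding is 768 × 4 bytes ≈ 3KB, which is where the tiny storage footprint comes from: nothing in the diffusion model’s weights changes.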

StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation

StoryMaker can generate a series of images with consistent characters throughout. It keeps the same facial features, clothing, hairstyles, and body types, allowing for cohesive storytelling.

StoryMaker examples

Reflecting Reality: Enabling Diffusion Models to Produce Faithful Mirror Reflections

Reflecting Reality can generate realistic mirror reflections using a method called MirrorFusion. It allows users to control mirror placement and achieves better reflection quality and geometry than other methods.

Reflecting Reality example

Video

PortraitGen: Portrait Video Editing Empowered by Multimodal Generative Priors

PortraitGen can edit portrait videos using multimodal prompts while keeping the video smooth and consistent. It renders over 100 frames per second and supports various styles like text-driven and relighting, ensuring high quality and temporal consistency.

PortraitGen examples

GMRW: Self-Supervised Any-Point Tracking by Contrastive Random Walks

GMRW can track any point in a video using a self-supervised global matching transformer trained with contrastive random walks.

GMRW example
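For those curious how a contrastive random walk supervises tracking without labels: you walk point correspondences forward through the frames and then back again, and train the features so the round trip lands back where it started. Here’s a tiny NumPy sketch of that cycle-consistency objective, heavily simplified from the actual method; all names and shapes are my own illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cycle_walk_loss(feats):
    """feats: (T, N, D) features of N points across T frames.

    Build soft transition matrices between adjacent frames from
    feature similarity, walk forward then backward, and score the
    round-trip matrix against the identity (each point should
    return to itself). Lower is better.
    """
    T, N, _ = feats.shape
    P = np.eye(N)
    for t in range(T - 1):                      # forward walk
        P = P @ softmax(feats[t] @ feats[t + 1].T)
    for t in range(T - 1, 0, -1):               # backward walk
        P = P @ softmax(feats[t] @ feats[t - 1].T)
    # cross-entropy against identity targets
    return -np.mean(np.log(np.diag(P) + 1e-9))

# Distinctive, stable features give near-identity transitions
# and hence a near-zero loss.
good_feats = np.stack([10.0 * np.eye(3)] * 4)   # (T=4, N=3, D=3)
print(cycle_walk_loss(good_feats))
```

Minimizing this loss over many videos is what lets the model learn to match points with no ground-truth tracks at all.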

MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling

MIMO can create controllable character videos from a single image. It allows users to animate characters with complex motions in real-world scenes by encoding 2D videos into 3D spatial codes for flexible control.

MIMO example

LVCD: Reference-based Lineart Video Colorization with Diffusion Models

LVCD can colorize lineart videos using a pretrained video diffusion model. It ensures smooth motion and high video quality by effectively transferring colors from reference frames.

LVCD examples

Skyeyes: Ground Roaming using Aerial View Images

Skyeyes can generate photorealistic sequences of ground view images from aerial view inputs. It ensures that the images are consistent and realistic, even when there are large gaps in views.

Skyeyes example

Audio

AudioEditor: A Training-Free Diffusion-Based Audio Editing Framework

AudioEditor can edit audio by adding, deleting, and replacing segments while keeping unedited parts intact. It uses a pretrained diffusion model with methods like Null-text Inversion and EOT-suppression to ensure high-quality results.

AudioEditor example

AVSoundscape: Self-Supervised Audio-Visual Soundscape Stylization

AVSoundscape can change how speech sounds by making it seem like it was recorded in a different scene. It uses example clips from in-the-wild videos and latent diffusion to transfer sound properties, and it works even with unlabeled videos.

AVSoundscape example

Also interesting

“👀” by me.

And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:

  • Sharing it 🙏❤️
  • Following me on Twitter: @dreamingtulpa
  • Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday 😅)
  • Buying my Midjourney prompt collection on PROMPTCACHE 🚀
  • Buying a print of my art from my art shop 🖼️

Reply to this email if you have any feedback or ideas for this newsletter.

Thanks for reading and talk to you next week!

– dreamingtulpa

by @dreamingtulpa