AI Art Weekly #84

Hello there, my fellow dreamers, and welcome to issue #84 of AI Art Weekly! 👋

After scouring through 110+ papers for you this week, I bring you another chock-full issue of the latest and greatest AI news.

But before we jump in, a little heads up that there will be no issue next week as I’m going to get married 🤵‍♂️👰‍♀️. I’ll be back with a new issue on June 7th.

In this issue:

  • 3D: MirrorGaussian, GarmentDreamer, NOVA-3D
  • Motion: RemoCap, MagicPose4D, Semantic Gesticulator, CondMDI, SignLLM, SynCHMR
  • Images: Face Adapter, Images that Sound, DMD2, InstaDrag, EditWorld, RectifID
  • Video: FIFO-Diffusion, ReVideo, Slicedit, ViViD, MotionCraft, Generative Camera Dolly, MOFT
  • and more!

Cover Challenge 🎨

Theme: synesthesia
90 submissions by 58 artists
AI Art Weekly Cover Art Challenge synesthesia submission by pactalom
🏆 1st: @pactalom
AI Art Weekly Cover Art Challenge synesthesia submission by amorvobiscum
🥈 2nd: @amorvobiscum
AI Art Weekly Cover Art Challenge synesthesia submission by risugawa
🥉 3rd: @risugawa
AI Art Weekly Cover Art Challenge synesthesia submission by VikitoruFelipe
🥉 3rd: @VikitoruFelipe

News & Papers


MirrorGaussian: Reflecting 3D Gaussians for Reconstructing Mirror Reflections

MirrorGaussian is the first 3D Gaussian Splatting method that can reconstruct mirrors in real-time. It’s even able to add new mirrors and objects to existing scenes.

Placing a new object into a scene with a mirror

GarmentDreamer: 3DGS Guided Garment Synthesis with Diverse Geometry and Texture Details

GarmentDreamer can generate wearable, simulation-ready 3D garment meshes from text prompts. The method is able to generate diverse geometric and texture details, making it possible to create a wide range of different clothing items.

GarmentDreamer examples

NOVA-3D: Non-overlapped Views for 3D Anime Character Reconstruction

NOVA-3D can generate 3D anime characters from non-overlapped front and back views.

NOVA-3D examples


RemoCap: Disentangled Representation Learning for Motion Capture

RemoCap can reconstruct 3D human bodies from motion sequences. It’s able to capture occluded body parts with greater fidelity, resulting in less model penetration and motion distortion.

RemoCap example

MagicPose4D: Crafting Articulated Models with Appearance and Motion Control

MagicPose4D can generate 3D objects from text or images and transfer precise motions and trajectories from objects and characters in a video or mesh sequence.

MagicPose4D example

Semantic Gesticulator: Semantics-Aware Co-Speech Gesture Synthesis

Semantic Gesticulator can generate realistic gestures accompanying speech with strong semantic correspondence, which is vital for effective communication.

Semantic Gesticulator example

CondMDI: Flexible Motion In-betweening with Diffusion Models

CondMDI can generate precise and diverse motions that conform to flexible user-specified spatial constraints and text descriptions. This enables the creation of high-quality animations from just text prompts and inpainting between keyframes.

CondMDI motion inpainting example

SignLLM: Sign Languages Production Large Language Models

SignLLM is the first multilingual Sign Language Production (SLP) model. It can generate sign language gestures from input text or prompts and achieve state-of-the-art performance on SLP tasks across eight sign languages.

SignLLM examples

SynCHMR: Synergistic Global-space Camera and Human Reconstruction from Videos

SynCHMR can reconstruct camera trajectory, human motion, and the scene in one global coordinate frame from videos.

SynCHMR example


Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control

Face Adapter is a new face swapping method that can generate facial detail and handle face shape changes with fine-grained control over attributes like identity, pose, and expression.

Face Adapter examples

Images that Sound: Composing Images and Sounds on a Single Canvas

Images that Sound can generate spectrograms that look like images but can also be played as sound.

Spectrogram examples

DMD2: Improved Distribution Matching Distillation for Fast Image Synthesis

DMD2 is an improved distillation method that can turn diffusion models into efficient one-step image generators.

DMD2 example generated in 4 steps

InstaDrag: Lightning Fast and Accurate Drag-based Image Editing Emerging from Videos

InstaDrag can do drag-based image editing in just one second. The method is trained on videos and is able to perform local shape deformations not present in the training data, like lengthening hair or twisting rainbows.

InstaDrag example

EditWorld: Simulating World Dynamics for Instruction-Following Image Editing

EditWorld can simulate world dynamics and edit images based on instructions that are grounded in various world scenarios. The method is able to add, replace, delete, and move objects in images, as well as change their attributes and perform other operations.

EditWorld examples

RectifID: Personalizing Rectified Flow with Anchored Classifier Guidance

RectifID is yet another personalization method for diffusion models that works from user-provided reference images of human faces, live subjects, and certain objects.

RectifID examples


FIFO-Diffusion: Generating Infinite Videos from Text without Training

Infinitely long AI videos are coming! FIFO-Diffusion is a new inference method for existing text-to-video models like VideoCrafter2, Open-Sora-Plan, and ZeroScope that makes it possible to generate infinitely long videos without any additional training!

5-second excerpt from a 51-second clip generated by VideoCrafter2 with the prompt “An astronaut walking on the moon's surface, high-quality, 4K resolution.”

ReVideo: Remake a Video with Motion and Content Control

Video editing is getting wild! ReVideo can change the content of a specific area while keeping the motion constant, customize new motion trajectories, or modify both content and motion trajectories.

ReVideo example

Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices

Slicedit can edit videos with a simple text prompt, retaining the structure and motion of the original video while adhering to the target text.

Slicedit examples

ViViD: Video Virtual Try-on using Diffusion Models

ViViD can transfer a clothing item onto the video of a target person. The method is able to capture garment details and human posture, resulting in more coherent and lifelike videos.

ViViD example

MotionCraft: Zero-Shot Video Generation

MotionCraft can animate single images based on physics, resulting in videos that evolve more coherently from the first frame. The method is able to simulate different physics, such as fluid dynamics, rigid motion, and multi-agent systems, and can also be combined with animation software to generate the required optical flows.

MotionCraft example

Generative Camera Dolly

Generative Camera Dolly can regenerate a video from any chosen perspective. Still very early, but imagine being able to change any shot or angle in a video after it’s been recorded!

GCD driving scene completion example

MOFT: MOtion FeaTure

MOFT is a training-free video motion interpreter and controller. It can be used to extract motion information from video diffusion models and guide the motion of generated videos without the need for retraining.

MOFT example

Also interesting

  • MetaEarth: A Generative Foundation Model for Global-Scale Remote Sensing Image Generation
  • DoGaussian: Distributed-Oriented Gaussian Splatting
  • MOSS: Motion-based 3D Clothed Human Synthesis from Monocular Video

“Tainted Blossom III” by me.

And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:

  • Sharing it 🙏❤️
  • Following me on Twitter: @dreamingtulpa
  • Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday 😅)
  • Buying a physical art print to hang on your wall

Reply to this email if you have any feedback or ideas for this newsletter.

Thanks for reading and talk to you next week!

– dreamingtulpa

by @dreamingtulpa