AI Art Weekly #89

Hello there, my fellow dreamers, and welcome to issue #89 of AI Art Weekly! 👋

I’ve spent a lot of time this week optimizing and automating the process of putting these issues together, which means more time to build cool stuff! The best thing about AI / computer vision art is its wonderful community full of curious and creative minds, so my current goal is to build more community-centric things. I’m super open to ideas, just reply to this email if you have some 🙏

In this issue:

  • 3D: EVA, Meta 3D AssetGen, HouseCrafter
  • Image: MIGC++, StyleShot, Magic Insert, LLM4GEN
  • Video: LivePortrait, DiffIR2VR-Zero, DIRECTOR, MimicMotion, CAVIS
  • Audio: FoleyCrafter, PicoAudio
  • and more!

Cover Challenge 🎨

Theme: boredom
85 submissions by 53 artists
🏆 1st: @AleRVG
🥈 2nd: @pactalom
🥉 3rd: @mamaralic
🧑 4th: @letture_

News & Papers

3D

Expressive Gaussian Human Avatars from Monocular RGB Video

EVA can generate expressive human avatars with detailed hand and facial animations from a single RGB video.

EVA examples

Meta 3D AssetGen: Text-to-Mesh Generation with High-Quality Geometry, Texture, and PBR Materials

Meta 3D AssetGen can generate high-quality meshes from text or images and supports texture and material control.

Meta 3D AssetGen examples

HouseCrafter: Lifting Floorplans to 3D Scenes with 2D Diffusion Model

Want to see what your next flat, house or film set could look like in 3D? HouseCrafter can lift a floorplan into a complete 3D indoor scene.

HouseCrafter example

Image

MIGC++: Advanced Multi-Instance Generation Controller for Image Synthesis

MIGC++ is a plug-and-play controller that gives Stable Diffusion precise position control while keeping attributes like color, shape, material, texture, and style correct. It can also control the number of instances and improve interaction between instances.

MIGC++ example

StyleShot: A Snapshot on Any Style

StyleShot can mimic and transfer various styles from a reference image, such as 3D, flat, abstract or even fine-grained styles, without tuning.

StyleShot examples

Magic Insert: Style-Aware Drag-and-Drop

Magic Insert can drag-and-drop subjects from one image into another image while matching the style of the target image.

Magic Insert examples

LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation

LLM4GEN enhances the semantic understanding ability of text-to-image diffusion models by leveraging the semantic representation of LLMs. Meaning: it can handle more complex and dense prompts that involve multiple objects, attribute binding, and long descriptions.

LLM4GEN examples

Video

LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control

LivePortrait can animate a single source image with motion from a driving video. The method can generate high-quality videos at 60fps and retarget the motion to other characters.

LivePortrait examples

DiffIR2VR-Zero: Zero-Shot Video Restoration with Diffusion-based Image Restoration Models

DiffIR2VR-Zero is a zero-shot video restoration method that can be used with any 2D image restoration diffusion model. The method can do 8x super-resolution and high-standard-deviation video denoising.

DiffIR2VR-Zero example

E.T. the Exceptional Trajectories: Text-to-camera-trajectory generation with character awareness

DIRECTOR can generate complex camera trajectories from text that describes the relation and synchronization between the camera and characters.

DIRECTOR example

MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance

MimicMotion can generate high-quality videos of arbitrary length mimicking specific motion guidance. The method is able to produce videos of up to 10,000 frames with acceptable resource consumption.

MimicMotion example

Context-Aware Video Instance Segmentation

CAVIS can do instance segmentation on videos. It’s able to better track objects and improve instance matching accuracy, resulting in more accurate and stable instance segmentation.

CAVIS example

Audio

FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds

FoleyCrafter can generate high-quality sound effects for videos! Results aim to be semantically relevant and temporally synchronized with a video. It also supports text prompts to better control the video-to-audio generation.

Check the project page for some insane examples!

PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation

PicoAudio is a temporally controlled audio generation framework. The model can generate audio with precise timestamp and occurrence frequency control.

Check the project page for audio examples

Also interesting

“I’m swirlwinded” by me.

And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:

  • Sharing it 🙏❤️
  • Following me on Twitter: @dreamingtulpa
  • Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday 😅)
  • Buying a physical art print to hang on your wall

Reply to this email if you have any feedback or ideas for this newsletter.

Thanks for reading and talk to you next week!

– dreamingtulpa
