AI Art Weekly #89
Hello there, my fellow dreamers, and welcome to issue #89 of AI Art Weekly!
I've spent a lot of time this week optimizing and automating the process of putting these issues together. Which means: more time to build cool stuff! The best thing about AI / computer vision art is its wonderful community, full of curious and creative minds, so my current goal is to build more community-centric things. I'm super open to ideas, just reply to this email if you have some.
In this issue:
- 3D: EVA, Meta 3D AssetGen, HouseCrafter
- Image: MIGC++, StyleShot, Magic Insert, LLM4GEN
- Video: LivePortrait, DiffIR2VR-Zero, DIRECTOR, MimicMotion, CAVIS
- Audio: FoleyCrafter, PicoAudio
- and more!
Want me to keep up with AI for you? Well, that requires a lot of coffee. If you like what I do, please consider buying me a cup so I can stay awake and keep doing what I do.
Cover Challenge
For the next cover I'm looking for hybrid submissions! The reward is again $50 plus a rare role in our Discord community that lets you vote in the finals. The rulebook can be found here and images can be submitted here.
News & Papers
3D
Expressive Gaussian Human Avatars from Monocular RGB Video
EVA can generate expressive human avatars with detailed hand and facial animations from a single RGB video.
Meta 3D AssetGen: Text-to-Mesh Generation with High-Quality Geometry, Texture, and PBR Materials
Meta 3D AssetGen can generate high-quality meshes from text or images and supports texture and material control.
HouseCrafter: Lifting Floorplans to 3D Scenes with 2D Diffusion Model
Want to see what your next flat, house or film set could look like in 3D? HouseCrafter can lift a floorplan into a complete 3D indoor scene.
Image
MIGC++: Advanced Multi-Instance Generation Controller for Image Synthesis
MIGC++ is a plug-and-play controller that gives Stable Diffusion precise position control over individual instances while ensuring the correctness of attributes like color, shape, material, texture, and style. It can also control the number of instances and improve the interaction between them.
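To make "precise position control" concrete: a controller like this consumes a per-instance layout, i.e. a bounding box plus an attribute description for each instance, alongside the global prompt. Here's a minimal sketch of such a layout; the dict format and the `render_with_layout` stub are hypothetical illustrations, not MIGC++'s actual API:

```python
# Hypothetical layout format: each instance gets a bounding box
# (x0, y0, x1, y1 in relative image coordinates) and its own attribute prompt.
layout = {
    "prompt": "a cozy living room, photorealistic",
    "instances": [
        {"box": (0.05, 0.55, 0.45, 0.95), "desc": "a red velvet armchair"},
        {"box": (0.50, 0.50, 0.95, 0.95), "desc": "a wooden coffee table"},
        {"box": (0.30, 0.05, 0.70, 0.40), "desc": "a round brass wall clock"},
    ],
}

def render_with_layout(layout: dict) -> None:
    """Stand-in for a layout-conditioned Stable Diffusion call (not the real API)."""
    print(f"Global prompt: {layout['prompt']}")
    for i, inst in enumerate(layout["instances"]):
        print(f"  instance {i}: {inst['desc']} at box {inst['box']}")

render_with_layout(layout)
```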
StyleShot: A Snapshot on Any Style
StyleShot can capture and transfer the style of a reference image, from 3D, flat, or abstract looks down to fine-grained styles, without any tuning.
Magic Insert: Style-Aware Drag-and-Drop
Magic Insert can drag-and-drop subjects from one image into another image while matching the style of the target image.
LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation
LLM4GEN enhances the semantic understanding of text-to-image diffusion models by leveraging the semantic representation of LLMs. In practice, that means it can handle more complex and dense prompts involving multiple objects, attribute binding, and long descriptions.
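The general idea: take the richer semantic features an LLM produces for the prompt and fuse them with the CLIP text features the diffusion model already conditions on. Below is a rough sketch of that fusion step; the feature sizes and the simple project-and-concatenate fusion are my own simplification, not the paper's actual module:

```python
import torch
import torch.nn as nn

# Assumed feature sizes: CLIP text encoder (77 tokens x 768 dims),
# LLM hidden states (e.g. 256 tokens x 4096 dims). Both are dummy tensors here.
clip_feats = torch.randn(1, 77, 768)
llm_feats = torch.randn(1, 256, 4096)

# Project the LLM features into the CLIP embedding space, then concatenate
# along the sequence axis so the UNet's cross-attention can attend to both.
project = nn.Linear(4096, 768)
fused = torch.cat([clip_feats, project(llm_feats)], dim=1)
print(fused.shape)  # torch.Size([1, 333, 768])
```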
Video
LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control
LivePortrait can animate a single source image with motion from a driving video. The method generates high-quality videos at 60 fps and can retarget the motion to other characters.
DiffIR2VR-Zero: Zero-Shot Video Restoration with Diffusion-based Image Restoration Models
DiffIR2VR-Zero is a zero-shot video restoration method that can be used with any 2D image restoration diffusion model. The method can perform 8x super-resolution and high-standard-deviation video denoising.
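For context on why "zero-shot" is hard here: the naive way to reuse an image model on video is to restore each frame independently, which flickers; the paper's contribution is keeping frames consistent without any retraining. The toy baseline below only illustrates where that problem sits; `restore_frame` is a stub and the crude blending is not the paper's method:

```python
import numpy as np

def restore_frame(frame):
    """Stub for any 2D image restoration diffusion model (e.g. a super-resolver)."""
    return frame  # identity here; imagine a denoised / upscaled frame

def restore_video(frames, blend=0.5):
    """Frame-by-frame restoration with a crude blend against the previous output.
    This naive blending is NOT the paper's approach; it just shows the
    temporal-consistency problem a zero-shot method has to solve properly."""
    out = []
    for frame in frames:
        restored = restore_frame(frame)
        if out:
            restored = blend * restored + (1 - blend) * out[-1]
        out.append(restored)
    return out

video = [np.random.rand(64, 64, 3) for _ in range(8)]
print(len(restore_video(video)))
```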
E.T. the Exceptional Trajectories: Text-to-camera-trajectory generation with character awareness
DIRECTOR can generate complex camera trajectories from text descriptions of the relation and synchronization between the camera and characters.
MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
MimicMotion can generate high-quality videos of arbitrary length that mimic specific motion guidance. The method can produce videos of up to 10,000 frames with acceptable resource consumption.
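Clips that long are typically generated in overlapping segments whose overlaps get fused together. Here is a generic overlap-and-blend sketch of that idea; MimicMotion's own segment fusion happens in latent space and is more involved, and all numbers here are made up:

```python
import numpy as np

def split_into_segments(num_frames, seg_len=72, overlap=6):
    """Yield (start, end) index ranges covering a long sequence with overlaps."""
    start = 0
    while start < num_frames:
        end = min(start + seg_len, num_frames)
        yield start, end
        if end == num_frames:
            break
        start = end - overlap  # the next segment re-generates the overlapping frames

def fuse(prev_tail, next_head):
    """Linearly cross-fade the overlapping frames of two adjacent segments."""
    weights = np.linspace(1.0, 0.0, len(prev_tail))[:, None, None, None]
    return weights * prev_tail + (1 - weights) * next_head

print(list(split_into_segments(200)))  # [(0, 72), (66, 138), (132, 200)]
```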
Context-Aware Video Instance Segmentation
CAVIS can do instance segmentation on videos. It tracks objects better and improves instance-matching accuracy, resulting in more accurate and stable segmentations.
Audio
FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds
FoleyCrafter can generate high-quality sound effects for videos! Results aim to be semantically relevant to and temporally synchronized with the video. It also supports text prompts for finer control over the video-to-audio generation.
PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation
PicoAudio is a temporally controlled audio generation framework. The model can generate audio with precise control over event timestamps and occurrence frequency.
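In other words, you can specify not just which events occur, but when and how often. A small sketch of turning such an event schedule into a frame-level conditioning mask; the schedule format is my own illustration, not PicoAudio's actual input spec:

```python
import numpy as np

# Hypothetical event schedule: (event label, onset seconds, offset seconds).
schedule = [
    ("dog barking", 0.5, 1.5),
    ("dog barking", 3.0, 4.0),  # the event occurs twice -> frequency control
    ("door slam",   2.0, 2.3),
]

def schedule_to_mask(schedule, duration=5.0, fps=25):
    """Build a binary event-presence mask of shape (num_events, num_frames)."""
    labels = sorted({label for label, _, _ in schedule})
    mask = np.zeros((len(labels), int(duration * fps)))
    for label, onset, offset in schedule:
        row = labels.index(label)
        mask[row, int(onset * fps):int(offset * fps)] = 1.0
    return labels, mask

labels, mask = schedule_to_mask(schedule)
print(labels, mask.shape)  # ['dog barking', 'door slam'] (2, 125)
```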
Also interesting
The 7th Claire Silver AI contest results are finally out. @ainslie7 created this beautiful short documentary exploring the possibility of using AI as an art therapist and of making imaginary things physical. A must-watch, imo.
@nptacek is creating a 2D-to-3D level editor with code generated by Claude 3.5 Sonnet. Super cool!
One of the most impressive things about Runway Gen-3 is how well it understands and interprets prompts. Here is a miniature zebra walking on a fingertip by @jonaspeterson.
@dlostastronaut showed us that you can generate unlimited green-screen assets, such as fire, rain, or droplets, for later post-production using Gen-3.
@Martin_Haerlin has been experimenting with the timeless film style of Czech fairy tale movies in Gen-3.
@0xFramer created this beautiful clay animation using Viggle and Domo AI. He also shared a tutorial alongside it.
And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:
- Sharing it
- Following me on Twitter: @dreamingtulpa
- Buying me a coffee (I could seriously use it; putting these issues together takes me 8-12 hours every Friday)
- Buying a physical art print to hang on your wall
Reply to this email if you have any feedback or ideas for this newsletter.
Thanks for reading and talk to you next week!
– dreamingtulpa