AI Art Weekly #68

Hello there, my fellow dreamers, and welcome to issue #68 of AI Art Weekly! 👋

I've begun sinking my teeth into a new project. Hopefully I can tell you all more about it next week 🤫 But until then, let's check out this week's AI art news!

  • Google announced Lumiere – a new video model
  • ActAnywhere generates video backgrounds
  • 3DHM animates people with 3D camera control
  • Depth Anything is a new monocular depth estimation method
  • pix2gestalt estimates and inpaints the shape and appearance of occluded objects
  • UltrAvatar generates realistic and animatable 3D avatars
  • Diffuse to Choose lets you virtually place any item in any setting
  • GALA splits single-layer clothed 3D human meshes into multi-layered 3D assets
  • CreativeSynth is a new SOTA method for style retention in image generation
  • Interview with AI powerhouse AInigma
  • and more!

Cover Challenge 🎨

Theme: anima & animus
79 submissions by 50 artists
AI Art Weekly Cover Art Challenge anima & animus submission by YedaiArt
๐Ÿ† 1st: @YedaiArt
AI Art Weekly Cover Art Challenge anima & animus submission by webstark
🥈 2nd: @webstark
AI Art Weekly Cover Art Challenge anima & animus submission by visual_dose
🥈 2nd: @visual_dose
AI Art Weekly Cover Art Challenge anima & animus submission by VikitoruFelipe
🧡 4th: @VikitoruFelipe

News & Papers

Lumiere: A Space-Time Diffusion Model for Video Generation

Lumiere is Google's latest video model and it looks wild! The model was trained on a dataset of 30 million videos, along with their text captions, and is capable of generating 80 frames at 16 fps. It supports text-to-video, image-to-video, video inpainting and stylization. Unfortunately, Google has a track record of not releasing their models, but one can still hope 🥹

Lumiere inpainting example

ActAnywhere: Subject-Aware Video Background Generation

Given a subject sequence and a background image, ActAnywhere can generate video backgrounds that match the foreground motions. Pretty cool!

ActAnywhere example

3DHM: Synthesizing Moving People with 3D Control

3DHM can animate people with 3D camera control from a single image and a given target video motion sequence.

3DHM example

Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Depth Anything is a new monocular depth estimation method. The model is trained on 1.5M labeled images and 62M+ unlabeled images, which results in impressive generalization ability.

Depth Anything comparison with MiDaS

pix2gestalt: Amodal Segmentation by Synthesizing Wholes

pix2gestalt is able to estimate the shape and appearance of whole objects that are only partially visible behind occlusions.

pix2gestalt demo

UltrAvatar: A Realistic Animatable 3D Avatar Diffusion Model with Authenticity Guided Textures

UltrAvatar can generate realistic and animatable 3D avatars with PBR textures from a text prompt or a single image. The framework is also capable of texture editing, allowing you to change eye and hair colors, add aging effects, and even tattoos to your avatars.

UltrAvatar example

Diffuse to Choose: Enriching Image Conditioned Inpainting in Latent Diffusion Models for Virtual Try-All

Shooting product images for e-commerce shops is a time-intensive task, but Diffuse to Choose by Amazon can help with that. The inpainting model lets you virtually place any item in any setting, with detailed, semantically coherent blending as well as realistic lighting and shadows.

Diffuse to Choose example

GALA: Generating Animatable Layered Assets from a Single Scan

GALA can take a single-layer clothed 3D human mesh and decompose it into complete multi-layered 3D assets. The outputs can then be combined with other assets to create new clothed human avatars in any pose.

GALA example

CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion

Retaining the style of a reference image when editing and blending images is a continued challenge. CreativeSynth outperforms other methods at these tasks. It's not open-source yet, but the example images look promising.

CreativeSynth examples

Also interesting

  • DITTO: Diffusion Inference-Time T-Optimization for Music Generation
  • Hourglass Diffusion Transformers: Scalable High-Resolution Pixel-Space Image Synthesis
  • RL Diffusion: Large-scale Reinforcement Learning for Diffusion Models
  • GenMoStyle: Generative Human Motion Stylization in Latent Space
  • EmerDiff: Emerging Pixel-level Semantic Knowledge in Diffusion Models


This week we're talking to the AI powerhouse that is AInigma 🔥

Tools & Tutorials

These are some of the most interesting resources I've come across this week.

โ€œGazing inwardsโ€ by me available on objkt

And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:

  • Sharing it ๐Ÿ™โค๏ธ
  • Following me on Twitter: @dreamingtulpa
  • Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday 😅)
  • Buying a physical art print to hang on your wall

Reply to this email if you have any feedback or ideas for this newsletter.

Thanks for reading and talk to you next week!

– dreamingtulpa

by @dreamingtulpa