AI Art Weekly #62

Hello there, my fellow dreamers, and welcome to issue #62 of AI Art Weekly! 👋

The number of papers per week is still increasing. I skimmed through 196 this week, and it’s getting harder and harder to keep up. Luckily for you, I’m here to serve 😉. So, let’s look at this week’s highlights:

  • Higen-T2V video model
  • Marigold: A new state-of-the-art depth estimation model
  • StyleCrafter generates images and videos based on style references
  • MotionCtrl controls camera and object motions in videos
  • and 9 other video editing methods
  • X-Adapter makes SD1.5 plugins work with SDXL
  • AnimateZero - the next AnimateDiff?
  • Readout Guidance - the next ControlNet?
  • PhotoMaker: Realistic photos from reference images
  • Generative Rendering turns mesh animations into videos
  • WonderJourney lets you walk through paintings
  • Doodle Your 3D turns sketches into 3D models
  • AnimatableDreamer extracts motions from videos
  • CLIPDrawX generates vector sketches
  • AmbiGen creates text ambigrams
  • and more tutorials, tools and gems!

Cover Challenge 🎨

Theme: eggs
116 submissions by 65 artists
🏆 1st: @CranialOrigami
🥈 2nd: @Daftwerk_
🥉 3rd: @CurlyP139
🧡 4th: @chetbff

News & Papers

Higen-T2V: A New Text-to-Video Model

We’ve seen a lot of video generation models in the past few months, so it gets harder and harder to be impressed by the progress. Higen is yet another one, but its results are worth sharing! As usual, it’s not open-source yet, but worth keeping an eye on.

A video of a duckling wearing a medieval soldier helmet and riding a skateboard.

StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter

Given one or more style references, StyleCrafter can generate images and videos based on these referenced styles.

StyleCrafter examples

MotionCtrl: A Unified and Flexible Motion Controller for Video Generation

MotionCtrl is a flexible motion controller that can manage both camera and object motions in generated videos and can be used with VideoCrafter1, AnimateDiff, and Stable Video Diffusion.

MotionCtrl examples

Video editing is going BRRRRR

Usually, one or two video editing methods that look kind of interesting come out each week. Counting the two above, we had 11 this week! The other 9 each have one interesting aspect I wanted to highlight:

  • RAVE: Impressive consistent video editing.
  • Drag-A-Video: Drag and drop video editing.
  • BIVDiff: ControlNet support.
  • VMC, VideoSwap & SAVE: Swap subjects in videos.
  • MagicStick: Subject scaling, repositioning and human motion editing.
  • AVID: Any-length video inpainting.
  • FACTOR: Trajectory and appearance control.

RAVE examples

X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model

Good news for SD1.5 enthusiasts! X-Adapter enables pretrained models and plugins for Stable Diffusion 1.5 (ControlNet, LoRAs, etc.) to work directly with SDXL without any retraining 🥳

X-Adapter supports all kinds of SD1.5 plugins

AnimateZero: Video Diffusion Models are Zero-Shot Image Animators

AnimateZero looks like the next iteration of AnimateDiff. Like AnimateDiff, the method can generate videos from a single image using text prompts and supports video editing, frame interpolation, looped video generation and real image animation.

LivePhoto and DreamVideo are two other methods from this week that can animate images from text prompts.

AnimateZero example

Readout Guidance: Learning Control from Diffusion Features

Readout Heads look like the next evolution of guided image generation. Similar to ControlNet, they can be used for pose-, depth-, or edge-guided generation, but they’re much more lightweight: in the case of SDXL, a readout head requires at most 35MB of space and can be trained with as few as 100 paired examples. They also support drag-based image manipulation and identity-consistent generation. Very cool!

Readout Guidance Identity Consistency example
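
To make the idea a bit more concrete, here’s a minimal PyTorch sketch of what such a readout head could look like: a tiny convolutional head on top of frozen diffusion features, whose prediction error against a control target becomes a guidance gradient at sampling time. All names, shapes, and the stand-in “U-Net” below are my own illustrative assumptions, not the paper’s actual code:

```python
# Illustrative sketch of the readout-head idea, NOT the paper's code.
# Feature shapes, the stand-in "U-Net", and all names are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReadoutHead(nn.Module):
    """Tiny head that 'reads out' a spatial property (e.g. a depth map)
    from intermediate diffusion features. Small enough to stay well
    under the tens-of-MB mark, unlike a full ControlNet."""
    def __init__(self, feat_dim=1280, out_channels=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(feat_dim, 128, 3, padding=1),
            nn.SiLU(),
            nn.Conv2d(128, out_channels, 3, padding=1),
        )

    def forward(self, feats):  # feats: (B, feat_dim, H, W)
        return self.net(feats)

head = ReadoutHead()

# Training: fit the head on a small set of (features, target) pairs --
# the paper reports that as few as ~100 paired examples can suffice.
opt = torch.optim.Adam(head.parameters(), lr=1e-4)
feats = torch.randn(4, 1280, 16, 16)   # stand-in for cached U-Net features
target = torch.randn(4, 1, 16, 16)     # stand-in for paired depth maps
loss = F.mse_loss(head(feats), target)
loss.backward()
opt.step()

# Sampling-time guidance: nudge the latent so the readout matches a target.
# A real setup would get `feats` from the diffusion U-Net at each step;
# here a single conv stands in so the example stays self-contained.
fake_unet = nn.Conv2d(4, 1280, 3, stride=4, padding=1)
latent = torch.randn(1, 4, 64, 64, requires_grad=True)
pred = head(fake_unet(latent))
guide_loss = F.mse_loss(pred, torch.zeros_like(pred))  # toy control target
guide_loss.backward()
guided_latent = latent - 0.1 * latent.grad  # step toward the control signal
```

The appeal is the size/data trade-off: a head this small trains quickly on a handful of pairs, whereas a ControlNet needs a trainable copy of a large chunk of the U-Net and a lot more data.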

PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding

PhotoMaker can generate realistic human photos from input images and text prompts. It can change attributes like hair colour or glasses, turn subjects from artworks such as Van Gogh’s self-portrait into realistic photos, or mix the identities of multiple people. Super pumped for this one!

PhotoMaker examples

Generative Rendering: Controllable 4D-Guided Video Generation with 2D Diffusion Models

Generative Rendering is able to take an animated, low-fidelity rendered mesh and a text prompt as input and generate a stylized video based on it. The results, while flickery, have something unique to them.

Generative Rendering examples

WonderJourney: Going from Anywhere to Everywhere

WonderJourney lets you wander through your favourite paintings, poems and haikus. The method can generate a sequence of diverse yet coherently connected 3D scenes from a single image or text prompt.

WonderJourney example. This GIF doesn’t do it justice; check out the videos on the project page.

Doodle Your 3D: From Abstract Freehand Sketches to Precise 3D Shapes

Doodle Your 3D can turn abstract sketches into precise 3D shapes. The method can even edit shapes by simply editing the sketch. Super cool. Sketch-to-3D-print isn’t that far away now.

Doodle Your 3D examples

AnimatableDreamer: Text-Guided Non-rigid 3D Model Generation and Reconstruction with Canonical Score Distillation

AnimatableDreamer can create animated 3D objects from text prompts and animate them with motions extracted from monocular videos.

“A squirrel with red sweater”, with motions from a squirrel video

Relightable Gaussian Codec Avatars

Meta showcased Relightable Gaussian Codec Avatars this week. The method builds high-fidelity relightable and animatable head avatars that capture 3D-consistent sub-millimeter details, such as hair strands and pores, across dynamic face sequences. It handles the diverse materials of human heads, like eyes, skin, and hair, in a unified manner, and the avatars can be relit efficiently in real-time under both point-light and continuous illumination.

The open-source alternative to this would be MonoGaussianAvatar.

Relightable Gaussian Codec Avatars examples

CLIPDrawX: Primitive-based Explanations for Text Guided Sketch Synthesis

CLIPDrawX can generate vector sketches from text prompts using simple primitive shapes like circles, straight lines, and semi-circles.

CLIPDrawX example

AmbiGen: Generating Ambigrams from Pre-trained Diffusion Model

AmbiGen, on the other hand, is a new method for generating ambigrams. Ambigrams are calligraphic designs that have different meanings depending on the viewing orientation.

AmbiGen examples

More papers & gems

  • Cartoon Segmentation: Instance-guided Cartoon Editing with a Large-scale Dataset
  • NeRFiller: Completing Scenes via Generative 3D Inpainting
  • Gaussian Grouping: Segment and Edit Anything in 3D Scenes
  • Feature 3DGS 🪄: Supercharging 3D Gaussian Splatting to Enable Distilled Feature Fields
  • HyperDreamer: Hyper-Realistic 3D Content Generation and Editing from a Single Image
  • ImageDream: Image-Prompt Multi-view Diffusion for 3D Generation
  • FaceStudio: Put Your Face Everywhere in Seconds

Tools & Tutorials

These are some of the most interesting resources I’ve come across this week.

“😴😴🥱😴” by me, available on objkt. Featured in “A retrospective of collected AI pieces in 2023” by @aicollection_.

And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:

  • Sharing it 🙏❤️
  • Following me on Twitter: @dreamingtulpa
  • Buying me a coffee (I could seriously use it; putting these issues together takes me 8-12 hours every Friday 😅)
  • Buying a physical art print to hang on your wall

Reply to this email if you have any feedback or ideas for this newsletter.

Thanks for reading and talk to you next week!

– dreamingtulpa
