AI Art Weekly #77

Hello there, my fellow dreamers, and welcome to issue #77 of AI Art Weekly! πŸ‘‹

Another wild week in the AI world is behind us. Grok 1.5 is on the horizon, more fluid deepfakes are coming, the robots are evolving, and loved ones who have passed away have been brought back to life.

On top of that, I’ve gone through another 150+ papers for you this week. Happy Easter to those who celebrate it 🐰πŸ₯š

In this issue:

  • 3D generation: ThemeStation, DreamPolisher, TC4D, MonoHair, GaussianCube
  • Texture generation: Garment3DGen, Make-It-Vivid
  • Human Pose & Motion: TRAM, AiOS
  • Image generation and editing: FlashFace, PAID, NeuroPictor, ObjectDrop, Inclusion Matching, Attribute Control
  • Video generation and editing: Champ, TRIP, AniPortrait, StreamingT2V, Spectral Motion Alignment
  • and more!

Cover Challenge 🎨

Theme: ostara
62 submissions by 42 artists
πŸ† 1st: @samisantosai
πŸ₯ˆ 2nd: @moon__theater
πŸ₯‰ 3rd: @NomadsVagabonds
🧑 4th: @VikitoruFelipe

News & Papers

3D generation

ThemeStation: Generating Theme-Aware 3D Assets from Few Exemplars

Want a diverse set of trees, cars, or chairs for your 3D environment? ThemeStation can generate multiple variations from one or more similar 3D reference assets. The method also allows for editing 3D assets using text prompts.

ThemeStation can generate a gallery of theme-consistent 3D assets from few exemplars.

DreamPolisher: Towards High-Quality Text-to-3D Generation via Geometric Diffusion

DreamPolisher is yet another text-to-3D method. This one uses Gaussian Splats and ControlNet to generate high-quality and view-consistent 3D objects from text only.

DreamPolisher examples

TC4D: Trajectory-Conditioned Text-to-4D Generation

TC4D can animate 3D scenes generated from text along arbitrary trajectories. I can see this being useful for generating 3D effects for movies or games.

TC4D examples

MonoHair: High-Fidelity Hair Modeling from a Monocular Video

MonoHair can reconstruct high-fidelity 3D hair from a single monocular video. The results are very impressive, and the method handles a wide range of hair types and styles.

MonoHair example

GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling

GaussianCube is an image-to-3D model that can generate high-quality 3D objects from multi-view images. It also builds on 3D Gaussian Splatting: the unstructured Gaussians are converted into a structured voxel grid, and a 3D diffusion model is then trained on those grids to generate new objects.

GaussianCube examples
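
If you're wondering what "structuring" Gaussians means in practice, here is a minimal sketch of the general idea: bin the unordered Gaussians and their parameters into a dense voxel grid that a standard 3D diffusion model can consume. The paper uses optimal transport for that assignment; this toy version just does naive nearest-voxel binning to illustrate the shape of the data.

```python
# Rough sketch (not the official GaussianCube code): bin fitted 3D Gaussians
# into a dense voxel grid so a 3D diffusion model can operate on a fixed-size tensor.
# GaussianCube itself uses an optimal-transport assignment instead of naive binning.
import torch

def gaussians_to_voxel_grid(positions, features, resolution=32):
    """positions: (N, 3) in [-1, 1]; features: (N, C) per-Gaussian parameters."""
    C = features.shape[1]
    grid = torch.zeros(C, resolution, resolution, resolution)
    counts = torch.zeros(resolution, resolution, resolution)

    # Map continuous positions to voxel indices.
    idx = ((positions + 1) / 2 * (resolution - 1)).round().long().clamp(0, resolution - 1)

    # Average the features of all Gaussians that fall into the same voxel.
    for (x, y, z), f in zip(idx, features):
        grid[:, x, y, z] += f
        counts[x, y, z] += 1
    grid /= counts.clamp(min=1)
    return grid  # (C, R, R, R) tensor, ready for a 3D UNet / diffusion model

# Example: 10k Gaussians with 14 parameters each (position, scale, rotation, opacity, color).
grid = gaussians_to_voxel_grid(torch.rand(10_000, 3) * 2 - 1, torch.randn(10_000, 14))
print(grid.shape)  # torch.Size([14, 32, 32, 32])
```

The point is that once the representation is a fixed-size grid, off-the-shelf 3D diffusion architectures apply directly.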

Texture generation

Garment3DGen: 3D Garment Stylization and Texture Generation

Garment3DGen can stylize garment geometry and textures from 2D images and 3D meshes! The resulting garments can be fitted on top of parametric bodies and simulated. This could be used for hand-garment interaction in VR or to turn sketches into 3D garments.

Sketch-to-3D Garment example

Make-It-Vivid: Dressing Your Animatable Biped Cartoon Characters from Text

Make-It-Vivid generates high-quality texture maps for 3D biped cartoon characters from text instructions, making it possible to dress and animate characters based on prompts.

Make-It-Vivid textured characters doing the Fortnite dance

Human Pose & Motion

TRAM: Global Trajectory and Motion of 3D Humans from in-the-wild Videos

Ditch the expensive motion capture suits and cameras: TRAM can reconstruct the global trajectory and motion of one or multiple humans from monocular, in-the-wild videos.

TRAM reconstructing 3D human motion of two dancers from a video

AiOS: All-in-One-Stage Expressive Human Pose and Shape Estimation

TRAM isn’t the only motion capture method this week though. AiOS is another one that can estimate the human body, hand, and facial expressions in a single step, resulting in more accurate and complete 3D reconstructions of people.

La La Land AiOS example

Image generation and editing

FlashFace

Another LoRA contender for identity preservation enters the game: FlashFace. Based on one or a few reference face images and a text prompt, the method can change the age or gender of a person, turn virtual characters into real people, make real people into artworks, and swap faces while retaining facial details to a high degree.

FlashFace examples

PAID

PAID is a method that enables smooth, high-consistency image interpolation for diffusion models. GANs have been the king in that field so far, but this method shows promising results for diffusion models.

PAID interpolation example
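
For context, the usual diffusion baseline is to spherically interpolate (slerp) the initial noise between two seeds and generate each in-between image, which tends to be far less consistent than what PAID shows. The sketch below illustrates that baseline with the diffusers library (the model name and frame count are just placeholder assumptions, and this is not PAID's attention interpolation itself).

```python
# Baseline sketch of diffusion image interpolation via noise slerp.
# Assumes a CUDA GPU and the diffusers + torch packages; not PAID's actual method.
import torch
from diffusers import StableDiffusionPipeline

def slerp(a, b, t, eps=1e-8):
    """Spherical interpolation between two flattened noise tensors."""
    a_n, b_n = a / a.norm(), b / b.norm()
    omega = torch.acos((a_n * b_n).sum().clamp(-1 + eps, 1 - eps))
    return (torch.sin((1 - t) * omega) * a + torch.sin(t * omega) * b) / torch.sin(omega)

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

shape = (1, pipe.unet.config.in_channels, 64, 64)  # 512x512 output -> 64x64 latents
z0 = torch.randn(shape, generator=torch.Generator().manual_seed(0))
z1 = torch.randn(shape, generator=torch.Generator().manual_seed(1))

frames = []
for t in torch.linspace(0, 1, 9):  # 9 interpolation steps between the two seeds
    latents = slerp(z0.flatten(), z1.flatten(), t).reshape(shape).to("cuda", torch.float16)
    frames.append(pipe("a portrait of a fox, oil painting", latents=latents).images[0])
```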

NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation

It’s been a while since we heard anything about 🧠-to-image. NeuroPictor is a new method for fMRI-to-image reconstruction. With Neuralink on the horizon, it might not be long until we can visualize our dreams 🀯

NeuroPictor examples

ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion

Google’s ObjectDrop enables photorealistic object removal and insertion while considering their effects on the scene. Buuut, it’s from Google. So, you know, probably will never get into the hands of us normies :(

ObjectDrop examples

Inclusion Matching for Animation Paint Bucket Colorization

Paint bucket colorization just got so much easier. Inclusion Matching can colorize line art in animations automatically. The technique requires painters to colorize just one frame, after which the algorithm autonomously propagates the color to subsequent frames.

Inclusion Matching example
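
To make the β€œcolorize one frame, propagate the rest” workflow concrete, here is a toy sketch that matches line-enclosed segments between frames by their centroids and copies the reference colors over. The actual paper learns an inclusion-aware matching that is far more robust to occlusion and deformation; everything below is a simplified stand-in.

```python
# Toy sketch (not the paper's inclusion-matching network): propagate colors from a
# single colorized reference frame to the next frame by matching line-enclosed
# segments via their centroids.
import numpy as np
from scipy import ndimage

def propagate_colors(ref_lineart, ref_colors, next_lineart):
    """lineart arrays: bool, True on line pixels; ref_colors: (H, W, 3) filled reference."""
    ref_labels, n_ref = ndimage.label(~ref_lineart)    # segments enclosed by lines
    next_labels, n_next = ndimage.label(~next_lineart)

    ref_centroids = np.array(
        ndimage.center_of_mass(~ref_lineart, ref_labels, range(1, n_ref + 1))
    )
    out = np.zeros_like(ref_colors)                    # line pixels stay black

    for seg in range(1, n_next + 1):
        mask = next_labels == seg
        cy, cx = ndimage.center_of_mass(mask)
        # Nearest reference segment by centroid distance.
        nearest = np.argmin(((ref_centroids - [cy, cx]) ** 2).sum(axis=1)) + 1
        # Copy that reference segment's (median) color into the new segment.
        out[mask] = np.median(ref_colors[ref_labels == nearest], axis=0)
    return out
```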

Attribute Control: Continuous, Subject-Specific Attribute Control in T2I Models by Identifying Semantic Directions

Attribute Control enables fine-grained control over attributes of specific subjects in text-to-image models. This lets you modify attributes like age, width, makeup, smile and more for each subject independently.

Attribute Control example
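
The core trick, as I understand it, is to nudge a specific subject’s token embedding along a learned semantic direction before it reaches the diffusion model. A minimal sketch of that idea follows; the direction vector and token index here are placeholders, not the paper’s learned values.

```python
# Hedged sketch: steer one subject's token embedding along a semantic direction
# (e.g. "age") with a continuous strength. The direction below is random noise
# standing in for a direction identified from contrasting prompts.
import torch

def apply_attribute(prompt_embeds, subject_token_idx, direction, strength):
    """prompt_embeds: (seq_len, dim) text embeddings fed to the diffusion model."""
    edited = prompt_embeds.clone()
    edited[subject_token_idx] += strength * direction  # push only the subject's token
    return edited

dim = 768
prompt_embeds = torch.randn(77, dim)   # e.g. a CLIP text encoder output
age_direction = torch.randn(dim)       # placeholder for a learned "age" direction
older = apply_attribute(prompt_embeds, subject_token_idx=5,
                        direction=age_direction, strength=2.0)
```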

Video generation and editing

Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance

It’s been a while since the Animate Anyone drama. Champ is the next iteration of that idea: generating videos of anyone from a single image and a bit of motion guidance.

Stylized Champ example

TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models

TRIP is a new approach to image-to-video generation with better temporal coherence.

TRIP example

AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation

AniPortrait can generate high-quality portrait animations driven by audio and a reference portrait image. It also supports face reenactment from a reference video.

AniPortrait example. Check the project page for examples with audio.

StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text

StreamingT2V enables long text-to-video generations featuring rich motion dynamics without any stagnation. It ensures temporal consistency throughout the video, aligns closely with the descriptive text, and maintains high frame-level image quality. Videos can be up to 1200 frames, spanning 2 minutes, and can be extended for even longer durations.

Wide shot of battlefield, stormtroopers running...
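
The general recipe behind this kind of long-form generation is autoregressive chunking: generate a short clip, then condition the next clip on the tail frames of the previous one. Here’s a minimal sketch of that loop, with `generate_chunk` as a hypothetical stand-in for a short-video diffusion model (StreamingT2V adds dedicated conditional attention and appearance preservation modules on top of this basic idea).

```python
# Minimal sketch of autoregressive long-video generation. `generate_chunk` is a
# hypothetical callable (prompt, context_frames, n) -> n new frames conditioned
# on the previous chunk's tail; it is not StreamingT2V's actual API.
def generate_long_video(prompt, generate_chunk, chunk_len=16, overlap=8, total_frames=1200):
    frames, context = [], []                 # the first chunk is generated without context
    while len(frames) < total_frames:
        new_frames = generate_chunk(prompt, context, chunk_len)
        frames.extend(new_frames)
        context = new_frames[-overlap:]      # the next chunk attends to this tail
    return frames[:total_frames]

# Toy usage with a dummy "model" so the loop runs end to end.
dummy_model = lambda prompt, ctx, n: [None] * n
video = generate_long_video("stormtroopers running across a battlefield", dummy_model)
print(len(video))  # 1200
```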

Spectral Motion Alignment for Video Motion Transfer using Diffusion Models

Spectral Motion Alignment is a framework that can capture complex and long-range motion patterns within videos and transfer them to video-to-video frameworks like MotionDirector, VMC, Tune-A-Video, and ControlVideo.

Spectral Motion Alignment example

Also interesting

  • Octree-GS: Towards Consistent Real-time Rendering with LOD-Structured 3D Gaussians
  • EVA: Zero-shot Accurate Attributes and Multi-Object Video Editing
  • InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction
  • Blur2Blur: Blur Conversion for Unsupervised Image Deblurring on Unknown Domains
  • latentSplat: Autoencoding Variational Gaussians for Fast Generalizable 3D Reconstruction
  • DragAPart: Learning a Part-Level Motion Prior for Articulated Objects

β€œFlora’s Embrace πŸŒΊβ€ by me

And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:

  • Sharing it πŸ™β€οΈ
  • Following me on Twitter: @dreamingtulpa
  • Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday πŸ˜…)
  • Buying a physical art print to hang on your wall

Reply to this email if you have any feedback or ideas for this newsletter.

Thanks for reading and talk to you next week!

– dreamingtulpa