AI Art Weekly #104

Hello there, my fellow dreamers, and welcome to issue #104 of AI Art Weekly! πŸ‘‹

Models are getting better and better and moats are vanishing as fast as they are created. At the beginning of the week I stumbled upon a tool that lets you quickly add text behind subjects in images, so I was curious if I could replicate it with open-source models that run 100% in the browser. And I did: Text Behind Image. It’s free and even runs on mobile devices, so give it a try and let me know what you think!
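For the technically curious: the whole effect is just layer ordering. Here's a minimal, hypothetical canvas sketch, assuming you already have a subject cutout with a transparent background (the function and variable names are mine for illustration, not the tool's actual code):

```ts
// Minimal sketch of the "text behind subject" layering trick.
// Assumes `original` is the full image and `cutout` is the subject
// with a transparent background (e.g. from a background-removal model).
function drawTextBehindSubject(
  canvas: HTMLCanvasElement,
  original: HTMLImageElement,
  cutout: HTMLImageElement,
  text: string,
) {
  const ctx = canvas.getContext('2d')!;
  canvas.width = original.naturalWidth;
  canvas.height = original.naturalHeight;

  // 1) Background: the untouched original image.
  ctx.drawImage(original, 0, 0);

  // 2) Text layer: rendered on top of the background...
  ctx.font = 'bold 120px sans-serif';
  ctx.fillStyle = 'white';
  ctx.textAlign = 'center';
  ctx.fillText(text, canvas.width / 2, canvas.height / 2);

  // 3) ...but behind the subject, because the cutout is drawn last.
  ctx.drawImage(cutout, 0, 0);
}
```

Because the cutout is drawn last, the text ends up sandwiched between the background and the subject, which is the entire effect.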

I also added another 10 high-quality SREF codes to PROMPTCACHE this week. If you’re looking for inspiration, you should definitely check them out.


Cover Challenge 🎨

Theme: e/acc
36 submissions by 22 artists
πŸ† 1st: @EternalSunrise7
πŸ₯ˆ 2nd: @stonekaiju
πŸ₯‰ 3rd: @AiMachina
🧑 4th: @nicolasmariar

News & Papers

Highlights

RMBG-2.0

BRIA released RMBG-2.0, the second version of their background-removal model and a new state of the art for the task. My Text Behind Image tool currently runs its predecessor, RMBG-1.4. I’m not sure if 2.0 also works on mobile devices, but I’ll give it a try once I figure out how to implement it.

RMBG-2.0 examples
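If you want to replicate the browser setup, here's roughly how RMBG-1.4 can run client-side with Transformers.js, loosely following the example on the model card. Treat the processor config as an assumption and double-check it against the card; RMBG-2.0 will likely need a different setup:

```ts
import { AutoModel, AutoProcessor, RawImage } from '@xenova/transformers';

// Load the model and a matching processor. The processor config below
// mirrors the settings published on the RMBG-1.4 model card.
const model = await AutoModel.from_pretrained('briaai/RMBG-1.4', {
  config: { model_type: 'custom' },
});
const processor = await AutoProcessor.from_pretrained('briaai/RMBG-1.4', {
  config: {
    do_normalize: true,
    do_pad: false,
    do_rescale: true,
    do_resize: true,
    image_mean: [0.5, 0.5, 0.5],
    image_std: [1, 1, 1],
    feature_extractor_type: 'ImageFeatureExtractor',
    resample: 2,
    rescale_factor: 0.00392156862745098,
    size: { width: 1024, height: 1024 },
  },
});

// Run inference and resize the predicted mask back to the source size.
const image = await RawImage.fromURL('photo.jpg'); // hypothetical input
const { pixel_values } = await processor(image);
const { output } = await model({ input: pixel_values });
const mask = await RawImage.fromTensor(output[0].mul(255).to('uint8'))
  .resize(image.width, image.height);

// Apply the mask as the alpha channel to get a transparent cutout.
const canvas = document.createElement('canvas');
canvas.width = image.width;
canvas.height = image.height;
const ctx = canvas.getContext('2d')!;
ctx.drawImage(image.toCanvas(), 0, 0);
const pixels = ctx.getImageData(0, 0, image.width, image.height);
for (let i = 0; i < mask.data.length; ++i) {
  pixels.data[4 * i + 3] = mask.data[i];
}
ctx.putImageData(pixels, 0, 0);
```

Everything here downloads once and then runs locally via ONNX/WASM, which is why it works offline and on mobile.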

3D

BPT: Scaling Mesh Generation via Compressive Tokenization

BPT can generate high-quality meshes with over 8,000 faces.

BPT example

GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation

GaussianAnything can generate high-quality 3D objects from single images or text prompts. It uses a Variational Autoencoder and a cascaded latent diffusion model for effective 3D editing.

GaussianAnything example

Edify 3D: Scalable High-Quality 3D Asset Generation

Edify 3D can generate high-quality 3D assets from text descriptions. It uses a diffusion model to create detailed quad-mesh topologies and high-resolution textures in under 2 minutes.

Edify 3D example

StdGEN: Semantic-Decomposed 3D Character Generation from Single Images

StdGEN can generate high-quality 3D characters from a single image in just three minutes. It decomposes characters into semantic parts like body, clothes, and hair, and its transformer-based model delivers strong results on 3D anime characters.

StdGEN example

ProEdit: Simple Progression is All You Need for High-Quality 3D Scene Editing

ProEdit can edit 3D scenes by breaking an edit into a progression of smaller subtasks, which reduces inconsistencies between views. Its scheduler accounts for task difficulty and lets users control how aggressive the edits are, achieving high-quality results without extra tools.

ProEdit example

DanceFusion: A Spatio-Temporal Skeleton Diffusion Transformer for Audio-Driven Dance Motion Reconstruction

DanceFusion can generate and reconstruct dance motions that are synchronized to music.

DanceFusion example

Image

MagicQuill: An Intelligent Interactive Image Editing System

MagicQuill enables efficient image editing with a simple interface that lets users easily insert elements and change colors. It uses a large language model to understand editing intentions in real time, improving the quality of the results.

MagicQuill example

Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models

Add-it can add objects to images based on text prompts without extra training. Its extended-attention mechanism balances the source image, the text prompt, and the generated output for natural placement and consistency, achieving top results on image-insertion tasks.

Add-it example

SeedEdit: Align Image Re-Generation to Image Editing

SeedEdit is an image model that can revise images based on text prompts while balancing how much of the original image to change and how much to preserve. It allows for high-resolution editing and supports changes like local replacements, geometric transformations, and style adjustments.

SeedEdit example

Video

DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion

DimensionX can generate photorealistic 3D and 4D scenes from a single image using controllable video diffusion.

DimensionX example

SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation

SG-I2V can control object and camera motion in image-to-video generation using bounding boxes and trajectories.

SG-I2V example

MVideo: Motion Control for Enhanced Complex Action Video Generation

MVideo can generate long videos with smooth actions by using mask sequences for motion control.

MVideo example

ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning

ReCapture can re-render a single user-provided video from new camera angles.

ReCapture example

Also interesting

β€œGN dreamers βœ¨β€ by me.

And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:

  • Sharing it πŸ™β€οΈ
  • Following me on Twitter: @dreamingtulpa
  • Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday πŸ˜…)
  • Buying my Midjourney prompt collection on PROMPTCACHE πŸš€
  • Buying access to AI Art Weekly Premium πŸ‘‘

Reply to this email if you have any feedback or ideas for this newsletter.

Thanks for reading and talk to you next week!

– dreamingtulpa
