AI Art Weekly #104
Hello there, my fellow dreamers, and welcome to issue #104 of AI Art Weekly!
Models are getting better and better and moats are vanishing as fast as they are created. At the beginning of the week I stumbled upon a tool that lets you quickly add text behind subjects in images, so I was curious if I could replicate it with open-source models that run 100% in the browser. And I did: Text Behind Image. It's free and even runs on mobile devices, so give it a try and let me know what you think!
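For the curious, the core trick behind a text-behind-subject tool is simple layering: segment the subject out of the photo, draw the text onto the full image, then paste the subject cut-out back on top so it occludes the text. Below is a minimal Pillow sketch of just that compositing step; the circle mask stands in for what a background-removal model like RMBG would produce, and all the names are illustrative, not the tool's actual code.

```python
from PIL import Image, ImageDraw

# Hypothetical inputs: in the real tool, `subject_mask` would come from a
# background-removal model; here we fake the subject with a solid blue
# circle and a matching alpha mask.
W, H = 200, 200
background = Image.new("RGB", (W, H), "white")

subject = Image.new("RGB", (W, H), "blue")
subject_mask = Image.new("L", (W, H), 0)
ImageDraw.Draw(subject_mask).ellipse((60, 60, 140, 140), fill=255)

# Step 1: draw the text directly onto the background layer.
canvas = background.copy()
ImageDraw.Draw(canvas).text((10, 90), "HELLO", fill="black")

# Step 2: paste the subject cut-out on top, so it covers the text
# wherever the mask is opaque — the text ends up "behind" the subject.
canvas.paste(subject, (0, 0), subject_mask)
```

In a browser version the same three layers would be composited on a canvas, with the mask coming from an on-device segmentation model instead of the hard-coded circle.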
Also added another 10 high-quality SREF codes to PROMPTCACHE this week. If you're looking for inspiration, you should definitely check them out.
Tired of no-code papers? Bookmark them directly in the newsletter below and get notified once their code is released. Unlock now.
Cover Challenge
For the next cover I'm looking for threshold-inspired submissions! Reward is a rare role in our Discord community which lets you vote in the finals. Rulebook can be found here and images can be submitted here.
News & Papers
Highlights
RMBG-2.0
BRIA released RMBG-2.0, the second version of their background removal model. It's a new state-of-the-art model for removing backgrounds from images. My Text Behind Image tool currently runs its predecessor, RMBG-1.4. Not sure if 2.0 also works on mobile devices, but I'll give it a try once I figure out how to implement it.
3D
BPT: Scaling Mesh Generation via Compressive Tokenization
BPT can generate high-quality meshes with over 8,000 faces.
GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation
GaussianAnything can generate high-quality 3D objects from single images or text prompts. It uses a Variational Autoencoder and a cascaded latent diffusion model for effective 3D editing.
Edify 3D: Scalable High-Quality 3D Asset Generation
Edify 3D can generate high-quality 3D assets from text descriptions. It uses a diffusion model to create detailed quad-mesh topologies and high-resolution textures in under 2 minutes.
StdGEN: Semantic-Decomposed 3D Character Generation from Single Images
StdGEN can generate high-quality 3D characters from a single image in just three minutes. It breaks down characters into parts like body, clothes, and hair, using a transformer-based model for great results in 3D anime character generation.
ProEdit: Simple Progression is All You Need for High-Quality 3D Scene Editing
ProEdit can edit 3D scenes by breaking tasks into smaller steps, which helps reduce errors in different views. It has a smart scheduler that considers task difficulty and lets users control how aggressive the edits are, achieving high-quality results without extra tools.
DanceFusion: A Spatio-Temporal Skeleton Diffusion Transformer for Audio-Driven Dance Motion Reconstruction
DanceFusion can generate and reconstruct dance pose movements that match music.
Image
MagicQuill: An Intelligent Interactive Image Editing System
MagicQuill enables efficient image editing with a simple interface that lets users easily insert elements and change colors. It uses a large language model to understand editing intentions in real time, improving the quality of the results.
Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models
Add-it can add objects to images based on text prompts without extra training. It uses a smart attention system for natural placement and consistency, achieving top results in image insertion tasks.
SeedEdit: Align Image Re-Generation to Image Editing
SeedEdit is an image model that can revise images based on text prompts while keeping a balance between changing and preserving the original image. It allows for high-resolution editing and supports various changes like local replacements, geometric transformations, and style adjustments.
Video
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion
DimensionX can generate photorealistic 3D and 4D scenes from a single image using controllable video diffusion.
SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation
SG-I2V can control object and camera motion in image-to-video generation using bounding boxes and trajectories.
MVideo: Motion Control for Enhanced Complex Action Video Generation
MVideo can generate long videos with smooth actions by using mask sequences for motion control.
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning
ReCapture can generate new videos with different camera angles from a single video.
Also interesting
--sref 1464347777
Pastel Phantasm is a dreamy, vintage-inspired style that combines ethereal elements and soft pastel colors to create atmospheric, photorealistic images that evoke contemplation and a sense of nostalgia.
And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:
- Sharing it
- Following me on Twitter: @dreamingtulpa
- Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday)
- Buying my Midjourney prompt collection on PROMPTCACHE
- Buying access to AI Art Weekly Premium
Reply to this email if you have any feedback or ideas for this newsletter.
Thanks for reading and talk to you next week!
β dreamingtulpa