AI Art Weekly #94
Hello there, my fellow dreamers, and welcome to issue #94 of AI Art Weekly! 👋
Just a heads up, if you don’t hear from me starting from next week, it’s because Mini-Tulpa has joined our realm. Yep, wifey is due any day now. Super excited and nervous at the same time. I’m gonna try to keep the newsletter going, but I might miss a week or two. I’m sure you understand 😅
On another note, I was busy exploring Midjourney SREF codes with Midjourney v6.1 this week. I'm already up to 40+ high-quality SREF codes on PROMPTCACHE. If you want lifetime access at the current price, now is the time to get in; I'm raising the price starting next week.
Some of you are already in and have created some amazing art with it. Check out this beautiful piece by Sherpa created with the Chromatic Lineage code. Please feel free to share your creations with me; I'll try to figure out a way to showcase them!
In this issue:
- 3D: MeshAnything V2, AvatarPose, UniTalker, An Object is Worth 64x64 Pixels, Head360, RayGauss, TexGen
- Image: Fast Sprite Decomposition, IPAdapter-Instruct, Lumina-mGPT, VAR-CLIP, Smoothed Energy Guidance, ProCreate, Don't Reproduce!, TurboEdit
- Video: ReSyncer
- and more!
Unlock the full potential of AI-generated art with my curated collection of Midjourney SREF codes and prompts. Last chance to get lifetime access at the current price!
Cover Challenge 🎨
For the next cover I’m looking for 🍓 submissions! Reward is again fame & glory and a rare role in our Discord community which lets you vote in the finals. Rulebook can be found here and images can be submitted here.
News & Papers
3D
MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization
MeshAnything V2 can generate 3D meshes from basically anything! Text, images, point clouds, NeRFs, Gaussian Splats, you name it.
AvatarPose: Avatar-guided 3D Pose Estimation of Close Human Interaction from Sparse Multi-view Videos
AvatarPose can estimate the 3D poses of people interacting closely with each other from sparse multi-view videos! It uses each person's reconstructed avatar as a guide, making the estimated poses more accurate and realistic.
UniTalker: Scaling up Audio-Driven 3D Facial Animation through A Unified Model
UniTalker can create 3D face animations from speech input! It works better than other tools, making fewer mistakes in lip movements and performing well even with new data it hasn’t seen before.
An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion
An Object is Worth 64x64 Pixels represents 3D objects as 64x64-pixel images and generates them with an image diffusion model! It creates realistic objects with good shapes and colors, matching more complex 3D-native methods.
Head360: Learning a Parametric 3D Full-Head for Free-View Synthesis in 360°
Head360 can generate a parametric 3D full-head model you can view from any angle! It works from just one picture, letting you change expressions and hairstyles quickly.
RayGauss: Volumetric Gaussian-Based Ray Casting for Photorealistic Novel View Synthesis
RayGauss can create realistic new views of 3D scenes using Gaussian-based ray casting! It produces high-quality images quickly, running at 25 frames per second, and avoids the visual artifacts that older methods suffered from.
TexGen: Text-Guided 3D Texture Generation with Multi-view Sampling and Resampling
TexGen can create high-quality 3D textures for objects from text descriptions! It uses a multi-view sampling and resampling technique with a pre-trained text-to-image diffusion model, producing more detailed and consistent textures than other methods.
Image
Fast Sprite Decomposition from Animated Graphics
Sprite-Decompose can break down animated graphics into sprites using videos and bounding-box annotations.
IPAdapter-Instruct: Resolving Ambiguity in Image-based Conditioning using Instruct Prompts
IPAdapter-Instruct can efficiently combine natural-image conditioning with “Instruct” prompts! It enables users to switch between various interpretations of the same image, such as style transfer and object extraction.
Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining
Lumina-mGPT can create photorealistic images from text and handle a range of visual and language tasks! It builds on a multimodal, decoder-only transformer with generative pretraining, making it possible to control image generation, do segmentation, estimate depth, and answer visual questions over multiple turns.
VAR-CLIP: Text-to-Image Generator with Visual Auto-Regressive Modeling
VAR-CLIP creates detailed fantasy images that match text descriptions closely by combining Visual Auto-Regressive techniques with CLIP! It uses text embeddings to guide image creation, ensuring strong results by training on a large image-text dataset.
Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention
SEG improves image generation for SDXL by smoothing the self-attention energy landscape! This boosts quality without relying on the usual guidance scale, using a query-blurring method that adjusts the attention weights and leads to better results with fewer side effects.
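For the curious, here's a minimal sketch of what query blurring in self-attention could look like. The tensor shapes, the 1D blur along the token axis, and the final blend are my own assumptions; the actual method applies the guidance through a separate pass of the diffusion model rather than inside a single attention call.

```python
import torch
import torch.nn.functional as F

def blur_queries(q, sigma=1.0, kernel_size=5):
    """Blur the query tokens with a 1D Gaussian along the sequence axis.

    q: (batch, heads, tokens, dim) self-attention queries.
    """
    x = torch.arange(kernel_size, dtype=q.dtype, device=q.device) - kernel_size // 2
    kernel = torch.exp(-x ** 2 / (2 * sigma ** 2))
    kernel = (kernel / kernel.sum()).view(1, 1, -1)

    b, h, t, d = q.shape
    q_flat = q.permute(0, 1, 3, 2).reshape(b * h * d, 1, t)   # treat each channel as a 1D signal
    q_blur = F.conv1d(q_flat, kernel, padding=kernel_size // 2)
    return q_blur.reshape(b, h, d, t).permute(0, 1, 3, 2)

def seg_attention(q, k, v, sigma=1.0, scale=1.0):
    """Blend plain attention with blurred-query attention (simplified combination)."""
    sharp = F.scaled_dot_product_attention(q, k, v)
    smooth = F.scaled_dot_product_attention(blur_queries(q, sigma), k, v)
    return smooth + scale * (sharp - smooth)
```

The nice part of the idea is that the blur strength (sigma) becomes the knob you turn instead of cranking a guidance scale.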
ProCreate, Don't Reproduce! Propulsive Energy Diffusion for Creative Generation
ProCreate boosts the diversity and creativity of diffusion-based image generation while avoiding the replication of training data. By pushing generated image embeddings away from reference images, it improves the quality of samples and lowers the risk of copying copyrighted content.
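As a rough illustration of the "push away" idea (not the authors' code), a guidance term like the one below could nudge the denoised estimate away from a set of reference embeddings at each sampling step. The encoder, the cosine-similarity energy, and the strength parameter are all assumptions on my part.

```python
import torch

def repulsion_grad(x0_pred, ref_embeds, embed_fn, strength=0.1):
    """Gradient that moves the predicted clean image away from reference embeddings.

    x0_pred:    current denoised estimate from the diffusion model
    ref_embeds: (n_refs, dim) embeddings of the reference set to stay away from
    embed_fn:   a differentiable image encoder (e.g. a CLIP image tower)
    """
    x = x0_pred.detach().requires_grad_(True)
    emb = embed_fn(x)                                          # (1, dim)
    energy = torch.cosine_similarity(emb, ref_embeds).mean()   # high when close to the references
    grad = torch.autograd.grad(energy, x)[0]
    return -strength * grad   # add this to the update so samples drift away from the references
```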
TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models
TurboEdit enables fast text-based image editing in just 3-4 diffusion steps! It improves edit quality and preserves the original image by using a shifted noise schedule and a pseudo-guidance approach, tackling issues like visual artifacts and weak edits.
Video
ReSyncer: Rewiring Style-based Generator for Unified Audio-Visually Synced Facial Performer
ReSyncer can create high-quality lip-synced videos from audio, with support for quick personalization and video-driven lip-syncing. It can also transfer speaking styles and swap faces, making it ideal for virtual presenters and performers.
Also interesting
I made a Claude Artifact that generates color palette grids, which can be fed to Midjourney as style reference images. I also made one that extracts the palette from an existing image. Check it out!
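If you want to hack on the palette-extraction side yourself, here's a rough Python equivalent of the idea (the artifact itself runs in the browser; Pillow + scikit-learn and the cluster count are my choices, not what the artifact uses):

```python
from PIL import Image
import numpy as np
from sklearn.cluster import KMeans

def extract_palette(path, n_colors=6):
    """Return the n_colors dominant colors of an image as hex strings."""
    img = Image.open(path).convert("RGB").resize((128, 128))   # downscale for speed
    pixels = np.asarray(img).reshape(-1, 3)
    centers = KMeans(n_clusters=n_colors, n_init=10).fit(pixels).cluster_centers_
    return ["#%02x%02x%02x" % tuple(int(c) for c in center) for center in centers]

print(extract_palette("reference.jpg"))
```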
I’m a sucker for real-time generation. @carlosbannon shared an example of combining the control and flexibility of procedural design with generative AI, showing how this opens up new creative possibilities and lets us explore complex geometric solutions with instant feedback.
@graycrawford shared this pretty cool concept of an alternative user interface for exploring image variations.
@MartinNebelong fed Gen-3 some cursed AI images this week, and it turns out the model is pretty good at keeping the footage coherent with the first frame when generating a video. Super cool.
And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:
- Sharing it 🙏❤️
- Following me on Twitter: @dreamingtulpa
- Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday 😅)
- Buying my Midjourney prompt collection on PROMPTCACHE 🚀
Reply to this email if you have any feedback or ideas for this newsletter.
Thanks for reading and talk to you next week!
– dreamingtulpa