AI Art Weekly #49
Hello there, my fellow dreamers, and welcome to issue #49 of AI Art Weekly! 👋
I’ve got another great issue for you this week, but before we jump in, a quick update: there will be no issue next week. I haven’t taken a break since January and I feel like I need one. I’ll be back with issue #50 on the 15th of September. So, let’s get started. The highlights of this week are:
- Ideogram: a new image generator from ex-Google Brain researchers
- MS-Image2Video: Open-Source Image-to-Video
- Dysen-VDM: Text-to-Video with better motion
- MagicAvatar: Multi-modal Avatar Generation and Animation
- MagicEdit: Video outpainting and a possible Gen-1 competitor 🤯
- Total Selfie: Helps you generate full-body selfies 😅
- Interview with artist Rocketgirl 🚀
- AI Eric Cartman song cover
- 3D Gaussian Splatting Tutorial
- and more guides, tools and gems!
As a Pro Member, you’ll be backing the evolution and expansion of AI Art Weekly. Together we’re at 45% of our Twitter API milestone. Join us!
Cover Challenge 🎨
It’s time for some chaos. For “unprompt”, I’m looking to shake things up. The idea is to take your usual prompts and modify them in whatever way you can imagine. Reversing the meaning, shifting or scrambling words, or converting to pig latin are all valid methods, but your imagination is the limit. The reward is $100 and the Challenge Winner role within our Discord community. This rare role earns you the exclusive right to cast a vote in the selection of future winners. The rulebook can be found here and images can be submitted here. I’m looking forward to your submissions 🙏
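If you want a head start on the mangling, here’s a minimal Python sketch of the kinds of transformations mentioned above. The function names and the naive pig latin rule are my own illustrative picks, not official challenge rules:

```python
import random

def reverse_words(prompt: str) -> str:
    """Reverse the word order of a prompt."""
    return " ".join(reversed(prompt.split()))

def scramble_words(prompt: str, seed: int = 49) -> str:
    """Shuffle the word order deterministically (seeded for repeatability)."""
    words = prompt.split()
    random.Random(seed).shuffle(words)
    return " ".join(words)

def pig_latin(prompt: str) -> str:
    """Naive pig latin: move each word's first letter to the end, add 'ay'."""
    return " ".join(
        w[1:] + w[0] + "ay" if w.isalpha() else w
        for w in prompt.split()
    )

prompt = "a serene mountain lake at sunrise"
print(reverse_words(prompt))  # sunrise at lake mountain serene a
print(pig_latin(prompt))      # aay ereneseay ountainmay akelay taay unrisesay
```

Feed the mangled string back into your favorite generator and see what chaos emerges.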
News & Papers
Ideogram: an answer to Midjourney, DreamStudio and DALL·E
Ideogram, a startup founded by former Google Brain researchers, launched its image generation service last week. The service is still quite limited, but it’s very good at generating images with legible text in them. It’s currently free to use, so don’t miss out on this opportunity.
MS-Image2Video & MS-Vid2Vid-XL
The team behind VideoComposer (issue 37) released MS-Image2Video and MS-Vid2Vid-XL this week, an open-source “alternative” to Gen-2’s and PikaLabs’ image-to-video feature. It’s not a 1-to-1 image animator though, as it uses the base image more as an inspiration for the video that gets generated. Similar to ZeroScope, the Vid2Vid model is used to upscale the video and remove flickering and artifacts from the lower-res version. There is a HuggingFace demo and a Google Colab available for you to try it out.
Dysen-VDM: Empowering Dynamics-aware Text-to-Video Diffusion with Large Language Models
While ZeroScope, Gen-2, PikaLabs and others have brought us high-resolution text- and image-to-video, they all suffer from choppy transitions, crude motion, and actions happening out of order. The new Dysen-VDM tries to tackle those issues and, while nowhere near perfect, delivers some promising results.
MagicAvatar: Multimodal Avatar Generation and Animation
MagicAvatar, on the other hand, is a multi-modal framework capable of converting various input modalities (text, video, and audio) into motion signals that subsequently generate and animate an avatar. Just look at these results.
MagicEdit: High-Fidelity Temporally Coherent Video Editing
But we aren’t done with video yet. MagicEdit not only does an extremely good job stylizing and editing videos (imo comparable with Gen-1), but it also supports video-outpainting 🤯
Total Selfie: Generating Full-Body Selfies
Afraid of asking strangers to take a photo of you? No problem. Total Selfie has got you covered: it can generate full-body selfies, similar to a photo someone else would take of you at a given scene. All you need is a pre-recording of yourself in your current outfit and a target pose. All that’s left is taking images of your face and the scenery during the day to produce a full-body image at each location. Holidays, here I come 😅
More papers & gems
- NRHints: Relighting Neural Radiance Fields with Shadow and Highlight Hints
- SketchDreamer: Interactive Text-Augmented Creative Sketch Ideation
- HoloFusion: Towards Photo-realistic 3D Generative Modeling
- FMB-Plus: Flexible Techniques for Differentiable Rendering with 3D Gaussians
- C2G2: Controllable Co-speech Gesture Generation
- SportsSloMo: A New Benchmark and Baselines for Human-centric Video Frame Interpolation
- MVDream: Multi-view Diffusion for 3D Generation
- InstructME: An Instruction Guided Music Edit Framework with Latent Diffusion Models
@smallfly is experimenting a lot with the 3D Gaussian Splatting for Real-Time Radiance Field Rendering repo and tested a scene with a magnifying glass. While still not crystal clear, the results are compelling. Look at the small details/hairs on the succulent plant. Rendered in real-time.
YouTuber Optimistic Prompt put together a movie trailer using ThinkDiffusion, RunwayML, ElevenLabs and ChatGPT. The voice-over especially is crazy good!
Did you ever think you needed AI Eric Cartman to sing “Bring Me to Life” by Evanescence? Well, me neither, but we all can be wrong sometimes 🤘
Interviews
Time for some unadulterated chaos. This week, I had the pleasure of talking to Rocketgirl. If you have time for a quick, to-the-point, yet deeply inspirational interview this weekend, this is it. Enjoy!
Tools & Tutorials
These are some of the most interesting resources I’ve come across this week.
The official code implementation from the 3D Gaussian Splatting for Real-Time Radiance Field Rendering paper. @camenduru put together a Colab version.
YouTuber The NeRF Guru put together an in-depth beginner’s guide that walks you through how to install 3D Gaussian Splatting for Real-Time Radiance Field Rendering and how to create your own scenes with 3D Gaussian Splats. No prior programming or command prompt experience needed.
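If you’re curious what the renderer in that tutorial actually computes per pixel, the heart of Gaussian Splatting is front-to-back alpha compositing of depth-sorted splats: each splat contributes its color weighted by its opacity and the remaining transmittance. Here’s a minimal Python sketch of that blending for a single pixel, simplified to scalar colors (the real renderer projects 3D Gaussians to 2D and does this per color channel on the GPU):

```python
def composite(splats):
    """Front-to-back alpha compositing of depth-sorted splats.

    Each splat is a (color, alpha) pair; the list must already be
    sorted nearest-first. Returns the blended color for one pixel.
    """
    color = 0.0
    transmittance = 1.0  # fraction of light not yet absorbed
    for c, a in splats:
        color += c * a * transmittance
        transmittance *= (1.0 - a)
        if transmittance < 1e-4:  # early exit once nearly opaque
            break
    return color

# A half-opaque white splat in front of a fully opaque black one:
# the pixel ends up 50% gray.
print(composite([(1.0, 0.5), (0.0, 1.0)]))  # 0.5
```

Sorting splats by depth first is what makes this a single cheap loop instead of an expensive volume integral, and it’s a big part of why these scenes render in real time.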
It’s been a while since I last checked the state of Stable Diffusion WebUIs, and Invoke AI got a significant update this week that puts it back onto my radar. It now supports SDXL on its unified canvas, which lets you inpaint and outpaint images, and adds a node-based workflows feature similar to ComfyUI, among other things.
Need to fix some AI-generated faces or blurry images? DiffBIR is an image restoration model that can help.
Remember the sausage fingers from the movie “Everything Everywhere All at Once”? Someone trained a model on those and now you can create an unlimited amount of people with them. Peak AI.
@iastitraia put together a long-form post about the collective unconscious, how it influences artists, and what you can do about it.
And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:
- Sharing it 🙏❤️
- Following me on Twitter: @dreamingtulpa
- Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday 😅)
Reply to this email if you have any feedback or ideas for this newsletter.
Thanks for reading and talk to you in two weeks!
– dreamingtulpa