AI Art Weekly #49

Hello there, my fellow dreamers, and welcome to issue #49 of AI Art Weekly! 👋

I’ve got another great issue for you this week, but before we jump in, a quick update: there will be no issue next week. I haven’t taken a break since January and I feel like I need one. I’ll be back with issue #50 on the 15th of September. So, let’s get started. The highlights of this week are:

  • Ideogram: a new image generator from former Google Brain researchers
  • MS-Image2Video: Open-Source Image-to-Video
  • Dysen-VDM: Text-to-Video with better motion
  • MagicAvatar: Multi-modal Avatar Generation and Animation
  • MagicEdit: Video outpainting and a possible Gen-1 competitor 🤯
  • Total Selfie: Helps you generate full-body selfies 😅
  • Interview with artist Rocketgirl 🚀
  • AI Eric Cartman song cover
  • 3D Gaussian Splatting Tutorial
  • and more guides, tools and gems!

Cover Challenge 🎨

Theme: echoes
151 submissions by 88 artists
AI Art Weekly Cover Art Challenge echoes submission by xizzikizzix
🏆 1st: @xizzikizzix
AI Art Weekly Cover Art Challenge echoes submission by psymulate
🥈 2nd: @psymulate
AI Art Weekly Cover Art Challenge echoes submission by UltimAI1138
🥈 2nd: @UltimAI1138
AI Art Weekly Cover Art Challenge echoes submission by weird_momma_x
🥉 3rd: @weird_momma_x

News & Papers

Ideogram: An answer to Midjourney, DreamStudio and DALL·E from ex-Googlers

Ideogram, a new image generation service founded by former Google Brain researchers, launched last week. The service is still limited, but it excels at generating images with legible text in them. It’s currently free to use, so don’t miss out on this opportunity.

"AI Art Weekly" logo, t-shirt design, typography created with Ideogram

MS-Image2Video & MS-Vid2Vid-XL

The team behind VideoComposer (issue 37) released MS-Image2Video and MS-Vid2Vid-XL this week. It’s an open-source “alternative” to the image-to-video features of Gen-2 and PikaLabs. It’s not a 1-to-1 image animator though; it uses the base image more as inspiration for the video that gets generated. Similar to ZeroScope, the Vid2Vid model is used to upscale the video and remove flickering and artifacts from the lower-res version. There are a HuggingFace demo and a Google Colab available for you to try it out.

MS-Image2Video example

Dysen-VDM: Empowering Dynamics-aware Text-to-Video Diffusion with Large Language Models

While ZeroScope, Gen-2, PikaLabs and others have brought us high-resolution text- and image-to-video, they all suffer from choppy transitions, crude motion, and actions occurring out of order. The new Dysen-VDM tackles these issues and, while nowhere near perfect, delivers some promising results.

Dysen example: A lady holds an umbrella, walking in the park with her friend.

MagicAvatar: Multimodal Avatar Generation and Animation

MagicAvatar, on the other hand, is a multi-modal framework that converts various input modalities (text, video, and audio) into motion signals, which then generate and animate an avatar. Just look at these results.

Magic Avatar demo with motion- and text-to-avatar: A boy running, red jacket

MagicEdit: High-Fidelity Temporally Coherent Video Editing

But we aren’t done with video yet. MagicEdit not only does an extremely good job stylizing and editing videos (imo comparable with Gen-1), but it also supports video-outpainting 🤯

Magic-Edit outpainting examples

Total Selfie: Generating Full-Body Selfies

Afraid of asking strangers to take a photo of you? No problem. Total Selfie has you covered: it generates full-body selfies that look like a photo someone else took of you at a given scene. All you need is a pre-recording of yourself in your current outfit and a target pose. Then all that’s left is taking images of your face and the scenery during the day to produce a full-body image at each location. Holidays, here I come 😅

Total Selfie examples

More papers & gems

  • NRHints: Relighting Neural Radiance Fields with Shadow and Highlight Hints
  • SketchDreamer: Interactive Text-Augmented Creative Sketch Ideation
  • HoloFusion: Towards Photo-realistic 3D Generative Modeling
  • FMB-Plus: Flexible Techniques for Differentiable Rendering with 3D Gaussians
  • C2G2: Controllable Co-speech Gesture Generation
  • SportsSloMo: A New Benchmark and Baselines for Human-centric Video Frame Interpolation
  • MVDream: Multi-view Diffusion for 3D Generation
  • InstructME: An Instruction Guided Music Edit Framework with Latent Diffusion Models


Interview with Rocketgirl 🚀

Time for some unadulterated chaos. This week, I had the pleasure of talking to Rocketgirl. If you have time for a quick, to-the-point, yet deeply inspirational interview this weekend, this is it. Enjoy!

Tools & Tutorials

These are some of the most interesting resources I’ve come across this week.

Image prompt: “a photograph without figures who are not wearing suits, in the style of utopian visions, bright impressionism, utopian countryside, fresh/intact, tea culture, diminutive, noise” — a reverse-meaning prompt I generated with ChatGPT from a prompt by @moelucio

And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:

  • Sharing it 🙏❤️
  • Following me on Twitter: @dreamingtulpa
  • Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday 😅)

Reply to this email if you have any feedback or ideas for this newsletter.

Thanks for reading and talk to you in two weeks!

– dreamingtulpa
