AI Art Weekly #29

Hello there my fellow dreamers and welcome to issue #29 of AI Art Weekly! 👋

We’re closing in on 1500 subscribers. Would be cool if we could reach that goal for issue #30. If you know someone who might be interested in this newsletter, please forward it to them. As always, ideas on how to improve the newsletter are welcome. Just respond to this email.

I have been building some helper tools with GPT-4 this week to aggregate the AI news, images and videos that end up in the newsletter. I must say that the productivity boost from it feels great. Hopefully this will make putting these condensed issues together more time-efficient starting next week, which means more time for me to work on long-overdue quality-of-life features for the website. But for now, let’s focus on what happened this week:

  • Stable Diffusion XL Beta available
  • ControlNet 1.1 released
  • Generative Agents 🦾
  • Interview with AI artist and writer Rebel Without Applause
  • SceneDreamer code released – generate unbounded 3D scenes
  • AudioLDM Google Colab – generate audio from text

Cover Challenge 🎨

Theme: surrealism
110 submissions by 64 artists
🏆 1st: @jakkvega
🥈 2nd: @VANYAxp
🥉 3rd: @Dazreil
🧡 4th: @OakOrobic

Reflection: News & Gems

Stable Diffusion SDXL Preview

Stability AI released the SDXL (Stable Diffusion XL) Beta model this week. It’s currently available within DreamStudio and their API. Unfortunately, the model weights aren’t open-sourced yet, but Stability AI announced that this will happen after some beta testing.

Some of the highlights of SDXL’s capabilities include:

  • Next-level photorealism capabilities
  • Enhanced image composition and face generation
  • Rich visuals and jaw-dropping aesthetics
  • Use of shorter prompts to create descriptive imagery
  • Greater capability to produce legible text

Inpainting and non-square aspect ratios aren’t available yet. Although first results don’t look that promising compared to Midjourney, I’m mostly excited about what the community will make out of this.

SDXL Beta model inpainting preview

ControlNet 1.1

ControlNet 1.1 landed this week. It mostly consists of updated and improved models, but there are also some noteworthy new additions.

First of all, the Openpose model is now able to work with hands and faces. No need for cumbersome workarounds to place hands.

Second of all, there are new guidance methods: line art, shuffle, pix2pix, inpaint and tile. Personally I’m most excited to try tile guidance. It splits up an image into tiles. For a given tile, it recognizes what is inside the tile and increases the influence of the recognized semantics, and it also decreases the influence of the global prompt if the contents do not match.
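
To give a rough idea of the "split into tiles" step, here is a minimal Python sketch. It is only an illustration and not ControlNet's actual code: the function name tile_boxes and the tile size of 512 are assumptions made for this example, and the semantic recognition per tile is not shown.

```python
def tile_boxes(width, height, tile_size):
    """Return (left, top, right, bottom) boxes that cover a width x height image.

    Hypothetical helper for illustration only; not part of ControlNet.
    """
    boxes = []
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            # Clamp the last tile in each row/column to the image border.
            right = min(left + tile_size, width)
            bottom = min(top + tile_size, height)
            boxes.append((left, top, right, bottom))
    return boxes


# Example: a 1024x768 image cut into tiles of at most 512x512 pixels.
for box in tile_boxes(1024, 768, 512):
    print(box)
```

Each box could then be cropped and processed on its own, which is where a tile model would look at the local content instead of relying only on the global prompt.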

ControlNet 1.1 Openpose example

Generative Agents: Interactive Simulacra of Human Behaviour

Autonomous AI agents like AutoGPT and babyagi are the latest craze in the AI world. The basic idea is to provide an LLM with a goal-oriented prompt, give it an environment with access to a web browser or local files, combine it with memory and then let it come up with tasks to reach its goals through a self-referential loop. Sounds similar to what we as humans do? Yes, that’s why some people think that, in theory, AGI isn’t that far away anymore. In practice? Well, I told it to write this newsletter and it failed miserably. So, still crunch-time for me 😅
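
The loop described above can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not how AutoGPT or babyagi are actually implemented: fake_llm is a hypothetical stand-in for a real LLM call, and run_agent, the task strings and max_steps are made up for this example.

```python
from collections import deque


def fake_llm(goal, memory, task):
    """Hypothetical stand-in for an LLM call: returns a result and follow-up tasks."""
    result = f"did '{task}' for goal '{goal}'"
    # Only propose follow-up tasks on the first call so the toy loop terminates.
    if not memory:
        new_tasks = [f"research {goal}", f"write draft about {goal}"]
    else:
        new_tasks = []
    return result, new_tasks


def run_agent(goal, max_steps=10):
    """Work through a task queue, feeding results back in as memory."""
    tasks = deque([f"plan how to: {goal}"])
    memory = []
    steps = 0
    while tasks and steps < max_steps:
        task = tasks.popleft()
        result, new_tasks = fake_llm(goal, memory, task)
        memory.append(result)    # remember what happened
        tasks.extend(new_tasks)  # the agent schedules its own next tasks
        steps += 1
    return memory


for entry in run_agent("write a newsletter"):
    print(entry)
```

The max_steps cap is there because a loop that creates its own tasks has no natural stopping point.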

Nonetheless, given the rapid pace of our advancement, what holds true today may not hold true next week. And writing a newsletter isn’t the only application for agents. Enter Generative Agents – computational software agents that simulate believable human behaviour.

Those generative agents wake up, cook breakfast, and head to work; artists paint, while authors write; they form opinions, notice each other, and initiate conversations; they remember and reflect on days past as they plan the next day. Thankfully not in a western theme park for now, but in an interactive sandbox environment inspired by The Sims – a small 2D town that consists of twenty-five of those agents. Because the whole thing is interactive, end users are able to prompt and influence the behaviour of agents and can observe how things will play out.

The code hasn’t been open-sourced, so you can’t play around with this yourself, but you can observe a pre-recorded world here.

Apart from the apparent Westworld implications, I can see this implemented in video games as a way to create fully automated, believable NPCs. Crazy times.

Generative Agents doing their thing

Rich-Text-To-Image: Expressive Text-to-Image Generation with Rich Text

I love to see explorations on how we can prompt differently compared to just using plain words. Rich-Text-To-Image introduces the ability to use format information from rich text like font size, color, style, and footnote for text-to-image generation. Looks like fun and you can try out the demo on HuggingFace.

Rich-Text-To-Image demo

vid2vid-zero for Zero-Shot Video Editing

vid2vid-zero is yet another video-to-video model. Compared to other methods, this one doesn’t require you to train or fine-tune a model before you can use it. The code just got released this week and you can try it out on HuggingFace.

vid2vid-zero examples

InstantBooth: Personalized Text-to-Image Generation without Test-Time Finetuning

What if you could generate images from an untrained concept by providing a few images and without having to fine-tune a model first? InstantBooth from Adobe might be the answer. The novel approach builds upon pre-trained text-to-image models and enables instant text-guided image personalization without fine-tuning. Compared to methods like DreamBooth and Textual-Inversion, the InstantBooth model can generate competitive results on unseen concepts concerning language-image alignment, image fidelity, and identity preservation while being 100 times faster. Wen open-source?

InstantBooth comparison

DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion

DreamPose is an image-to-video model that takes an input image of a person and a pose sequence, and generates a photorealistic video of that person following the pose sequence. The consistency of the output looks good, so this might be useful to animate characters.

DreamPose examples

More interesting research & gems

  • VidStyleODE: Disentangled Video Editing via StyleGAN and NeuralODEs
  • Zip-NeRF: Anti-Aliased Grid-Based Neural Radiance Fields
  • How you feelin’?: Learning Emotions and Mental States in Movie Scenes
  • Control3Diff: Learning Controllable 3D Diffusion Models from Single-view Images
  • Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction
  • Mesh2Tex: Generating Mesh Textures from Image Queries
  • DiffusionRig: Learning Personalized Priors for Facial Appearance Editing

Imagination: Interview & Inspiration

In this week’s issue of AI Art Weekly we talk with Rebel Without Applause aka @TymothyLongoria. I’ve been following Tymothy’s path since he first participated in the “cyberpunk” challenge and have loved seeing how his style evolved over the last few months. So I thought it’s time to ask him a few questions. Let’s jump in.

What’s your background and how did you get into AI art?

I have been a writer for nearly 20 years, and I have been writing seriously for about 14 of those years. This journey led me to opportunities such as writing for the now-defunct online macabre children’s magazine, Underneath The Juniper Tree, and acquiring an agent through the traditional query process. However, I left the community and took a break after my agent exited the business. I returned in late December 2021 and stumbled upon an image created with AI by the brilliant artist Nekro. I felt compelled to try it for myself, and the creative spark was reignited. And so, here we are, taking another step in my journey.

“Untitled” by Nekro

Do you have a specific project you’re currently working on? What is it?

Through experimentation and my passion for lore and worldbuilding, I have created a character named Stelle. Although she is more than just a project, I consider her to be a character within my enigmatic wilderness. At the moment, Stelle and the aforementioned wilderness are at the forefront of my creative endeavors.

Any tips for aspiring lore / worldbuilders?

You own your perspective, and only you can tell a story like you can. Your experiences, your views, the color you see the world in is yours, and that’s how you tell your story. Pay close attention to detail - people notice.

“_Stelle : She’s The One Who Likes All Our Pretty Songs” by Rebel Without Applause

What drives you to create?

The inherent need to share my thoughts, my divergent thinking.

The desire to write.

Writing+art=prompting.

What does your workflow look like?

Concepts and ideas come naturally to me, for better or worse, and I primarily use Midjourney, photomanips and Photoshop/After Effects.

What is your favourite prompt when creating art?

My favorite ways to prompt are to use phrases, that is, instead of using a man with a ten-gallon hat etc, I’ll write:

a devious and dapper entity with macabre and ominous notions

I use that and a myriad of similar dramatic prompts as foundations.

“_The Masks Of Our Fear” by Rebel Without Applause

How do you imagine AI (art) will be impacting society in the near future?

We’re already seeing people with no artistic background finding their way in and around and through the creative world. Those who before could only imagine, can now /imagine. And those who hone their skills, can only become better. Just as with the creation of art, non-writers can now become writers and poets with the help of ChatGPT. This hits me personally, and reminds me to empathize with traditional artists who for whatever reasons are against art creation with AI. I say this sincerely: it won’t impact me negatively, and I can honestly say I might give it a try to help lay the foundations for future stories and world-building.

“_The Many Faces of Stelle” by Rebel Without Applause

Who is your favourite artist?

I didn’t have a favorite band until a few years ago, so “favorite” isn’t a term I often use.

However, one artist that comes to mind is 0009, who has a brilliant aesthetic blend of color and rebellion. Other notable names are OTHERfaces_Ai, Aempatia, Trez Art, Doc T, Manu, and BLΛC.ai, to name just a few.

Bellini and Caravaggio also come to mind, as I tend to favor a darker, more ominous palette.

“Untitled” by DocT

You said you didn’t have a favourite band until “now” – which band is that?

Absolutely, and without a doubt - The Contortionist.

Anything else you would like to share?

Lastly let me say thank you for this opportunity to speak on a few things about my journey, and to leave the reader with this:

Find joy in what you do - in your art, your process, your style, and your expression. If you can, this will never get old.


Creation: Tools & Tutorials

These are some of the most interesting resources I’ve come across this week.

“find joy in what you do, abstract surreal dream interpretation --niji 5 --style expressive --ar 3:2 --no text, letters, typography” by me

And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:

  • Sharing it 🙏❤️
  • Following me on Twitter: @dreamingtulpa
  • Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday 😅)

Reply to this email if you have any feedback or ideas for this newsletter.

Thanks for reading and talk to you next week!

– dreamingtulpa

by @dreamingtulpa