AI Art Weekly #21
Hello there my fellow dreamers and welcome to Issue #21 of AI Art Weekly! 👋
I’ve recently been thinking about how to expand the aiartweekly.com website with some sort of community features. I’m unsure what this should entail, and instead of just building something no one has a use for, I wanted to ask you all directly: is there something you would like to see? This can be literally anything that comes to mind. One thing I’m currently exploring is a way for people to create art together. But your need can be as simple as receiving feedback on your work. Would love to hear from you.
Without further ado, let’s jump in. This is another packed issue with lots of cool stuff. The highlights are:
- ControlNet and T2I-Adapter. Some say the biggest leap in AI art since the public release of Stable Diffusion.
- Interview with AI artist @arvizu_la. Three-time Cover Art Challenge winner.
- LoRA+DreamBooth notebook.
- First Gen-1 examples are in.
- InstructPix2Pix Video with Coherence Guidance colab for us plebs who still have to wait for Gen-1 😅
Cover Challenge 🎨
The challenge for this week’s cover was “ruins”, which received 55 submissions from 33 artists. The community decided on the final winner:
Congratulations to @nova_visualss for winning the cover art challenge 🎉🎉🎉 and a big thank you to everyone who found the time to contribute!
The next challenge’s theme is “img2img” and is all about what you can create with one (or more) input images. You’re free to use any tools you want to get the job done. Midjourney /blend, regular img2img, InstructPix2Pix or the new ControlNet (see below) are all fair game. The only requirement is that you include the input images alongside your submissions. The reward is another $50. Rulebook can be found here and images can be submitted here.
I’m looking forward to all of your submissions 🙏
If you want to support the newsletter, this week’s cover is available for collection on objkt as a limited edition of 10 for 3ꜩ apiece. Thank you for your support 🙏😘
Reflection: News & Gems
ControlNet
Welcome to what is arguably the most exciting update since the open-source release of Stable Diffusion v1.4. Are you tired of having insufficient control in your text2img and img2img workflow? Well, with the launch of ControlNet, that’s about to change significantly.
ControlNet introduces a novel approach to controlling diffusion models, providing new methods to preserve and control the structure of an input image. This toolkit offers various methods, including Canny Edge and HED Boundary detection, both of which preserve essential details for recoloring and stylizing images. The M-LSD Lines detection method is excellent for maintaining the structure of interior and architectural content. OpenPose detects the pose in an input image and generates a new image while preserving the original pose. Then there is Semantic Segmentation, which splits an input image into different parts before generating, and Depth and Normal Map, which basically add depth2img functionality to Stable Diffusion v1.5.
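To give a feel for what an edge-based conditioning image is, here is a minimal sketch in plain Python. It is not ControlNet code and not the Canny algorithm; it is a simple gradient threshold that I’m using as a stand-in, and the function name edge_map and the threshold value are made up for illustration. The idea is the same, though: reduce the input image to its outlines and let those outlines guide the generation.

```python
def edge_map(image, threshold=0.5):
    """Return a binary edge map (1 = edge) for a 2D grayscale image.

    `image` is a list of rows with brightness values between 0.0 and 1.0.
    This is a crude gradient threshold, not the Canny detector.
    """
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Forward differences; border pixels compare against themselves.
            right = image[y][x + 1] if x + 1 < width else image[y][x]
            below = image[y + 1][x] if y + 1 < height else image[y][x]
            gradient = abs(right - image[y][x]) + abs(below - image[y][x])
            edges[y][x] = 1 if gradient > threshold else 0
    return edges


# Toy 4x4 image: dark left half, bright right half.
toy_image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
toy_edges = edge_map(toy_image)
for row in toy_edges:
    print(row)
```

The toy image only changes brightness once per row, so the map marks a single vertical line where dark meets bright. A real detector like Canny adds smoothing and edge thinning on top of this basic idea.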
Personally, I find the User Scribbles method particularly impressive, as it allows us to hand-draw images and use the lines and shapes of our drawings as guidelines for image generation. It’s just so much fun to play around with! For those less inclined to scribble, the Fake Scribbles method turns an input image into a scribble first.
You can, for example, prompt Midjourney for a hand sketch or minimalistic pencil scribble, input it into ControlNet and have it retain the overall structure of the generated sketch but transform it significantly depending on the method you’re using. I shared some examples this week.
And finally, there is an Anime Line Drawing model that is currently being tested.
Now if you want to use ControlNet, you have a few options.
- Automatic1111 Plugin. It’s listed on the extension tab within the UI. Install it and then download the ControlNet models to the plugin directory. No need to download the 5 GB models: there are trimmed safetensors that are 723 MB each, and I haven’t been able to spot a difference quality-wise.
- HuggingFace Space is the easiest way to experience ControlNet if you don’t mind waiting a bit. Optionally, you can duplicate the space to skip the queue.
- There is also a Jupyter notebook for installing Automatic1111 with ControlNet on cloud environments like Google Colab.
T2I-Adapter
But as usual, when one new approach for doing something with AI comes along, an alternative probably arrives alongside it. Tencent ARC Lab, the developers of GFPGAN, a real-world face restoration model some of us used before Midjourney was able to generate beautiful faces, released T2I-Adapter. T2I-Adapter also aims to provide more control over the diffusion process by implementing guidance for sketch, keypose and segmentation maps. I haven’t tested T2I-Adapter yet, but it seems to work fairly similarly to ControlNet. What’s new is the ability to apply local and sequential edits to existing images, something I haven’t been able to achieve with ControlNet so far. The adapters are also composable, meaning that you can use a segmentation map together with a sketch to further improve the guidance of the generated image. Code and examples are available in the GitHub repo linked above. There is no HuggingFace Space or Colab yet, but I’m sure this will change by next week.
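Since I haven’t run T2I-Adapter myself, take the following as a rough sketch of what “composable” could mean in practice rather than as the project’s actual code. It assumes that each adapter turns its input (a sketch, a segmentation map) into a list of guidance numbers and that several adapters are blended with a weighted sum. The function name compose_guidance, the feature values and the weights are all hypothetical.

```python
def compose_guidance(features, weights):
    """Blend equally sized guidance feature vectors with a weighted sum."""
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature vector")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all feature vectors must have the same length")
    return [
        sum(w * f[i] for f, w in zip(features, weights))
        for i in range(length)
    ]


# Hypothetical adapter outputs, flattened to short lists for readability.
sketch_features = [1.0, 0.0, 2.0]
segmentation_features = [0.0, 4.0, 2.0]

# Equal weights: both guidance signals contribute the same amount.
combined = compose_guidance(
    [sketch_features, segmentation_features], [0.5, 0.5]
)
print(combined)
```

Raising one weight and lowering the other would let one guidance signal dominate, which is the kind of knob you would expect when mixing a sketch with a segmentation map.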
PVDM: Video Probabilistic Diffusion Models in Projected Latent Space
PVDM is yet another video generation model. This one is able to generate short and long-form videos of between 16 and 128 frames. As with most of these, there is no code available. Results look promising though.
EVA3D: Compositional 3D Human Generation from 2D Image Collections
Let’s look at 3D. This week we got EVA3D, a model that is able to create meshed and textured 3D humans after training on a collection of 2D images. While the output still looks a lot like N64 and PS1 characters, the amazing thing is that EVA3D has explicit control over the poses, which apparently lets you animate the generated humans.
SinMDM: Single Motion Diffusion
Speaking of animation, SinMDM is able to take a single input motion sequence with arbitrary skeletal topology and synthesize additional motions that are faithful to the original input sequence. This makes it possible to expand upon existing animations, provide only selected joints as input and generate the rest, or apply a style transfer, for example turning a walking animation into a crouched animation.
Gen-1 examples
A few lucky ones already got access to Gen-1 and shared their results. Here is a list of a few examples. FOMO is real with this one 🙉.
Gems
Stumbled upon a lot of gems this week. Thanks to @spiritform for sharing some of these with me. If you’re interested in helping out with providing content for AI Art Weekly, reach out to join our (still very early days) Discord community.
@Oranguerillatan put together a music video for the band “Dead Man’s Couch” and the intro is just magic *chef’s kiss*.
@bensartnoodles tested ControlNet with a set of video frames and although the results still produce a lot of flicker depending on the scene and prompt, this one looks super cool.
@peteromallet is working on Banodoco, an open-source tool that combines multiple AI models to create coherent animated videos. He just showcased a vid2vid example of v0.2 and it looks amazing! You can currently sign up to become a beta tester.
Boy, this week keeps on giving. @justLV shared his process for achieving more temporal consistency when modifying video content with Stable Diffusion.
@ryunuck is working on what looks like a Deforum editor with visualized keyframe graphs for different audio stems. Looks cool.
@ouhenio is also working on a vid2vid pipeline, called Dreamcatcher. Also worth keeping an eye on.
Imagination: Interview & Inspiration
In today’s issue of AI Art Weekly, we talk to AI artist @arvizu_la. Arvizu has caught my attention over and over again since I started writing this newsletter with his captivating submissions to the Cover Art Challenge, which he has won three times (check out the covers of issues #15, #17 and #19). I am an admirer of his creations, which is why I wanted to interview him, and I’m grateful that he agreed to participate. So let’s dive in!
[AI Art Weekly] Arvizu, what’s your background and how did you get into AI art?
I’m a computer science major and I currently work as a substitute worker at an oil company in my hometown in Mexico. My work duties include office work, firefighting, conducting chemistry experiments in the laboratory, and assisting with various other tasks. Although it may seem unusual, it is commonplace here.
In July 2022, I became interested in AI art after learning about DALL-E and Midjourney. Drawing and art were once a frustrating dream of mine, but since discovering these tools, I decided to give it a try and have been captivated ever since.
[AI Art Weekly] Do you have a specific project you’re currently working on? What is it?
I don’t have any special projects going on right now, but I’m currently working on an animation for someone I have looked up to since my teenage years. To my surprise, after viewing some of my previous works, he expressed his admiration for my skills. Apart from this, I’m considering delving into content creation in my language. I have received numerous messages on Instagram and TikTok from individuals seeking my assistance with AI tools, so I might give it a try. However, I do this purely for fun.
[AI Art Weekly] What does your workflow look like?
To find inspiration, I enjoy taking afternoon walks as it helps me to unwind and sparks my creativity. I draw ideas from a variety of sources such as media, personal experiences, and anything that piques my interest. Upon discovering Midjourney, I began experimenting with dream prompts which have resulted in some fascinating concepts. I find that movies, anime, books, video games, and music from diverse artists like Tool, Die Antwoord, Megadeth, and Kanye, all contribute to my creative process. I’m not limited to any particular genre and enjoy exploring anything that resonates with me.
To simplify my creative process, I primarily use Midjourney and Niji for their easy prompts, which I frequently post on Instagram. It’s impressive that you can prompt from your phone, and I find it quite convenient. Additionally, I use Stable Diffusion for more specific outputs. With Stable Diffusion, I can train models and embeddings or choose from the community’s options.
My workflow entails photobashing using Midjourney prompts. By using simple prompts, I acquire the necessary elements for my main picture, which I paste onto a Photoshop canvas. Once complete, I return to Midjourney or Stable Diffusion and use img2img to improve coherence and correct any composition errors. With a polished output, I use Stable Diffusion in Photoshop to inpaint to enhance detail and resolution, and perhaps add a few touches. If the photobashing is visually appealing or meets my desired specifications, I proceed with the image as is. I’m not a Photoshop expert, I just know the basics. Stable Diffusion does the heavy lifting.
[AI Art Weekly] What is your favourite prompt when creating art?
Other than the words “woman” or “girl,” my preferred prompt phrases are “deep shadows” and “oil on canvas.” I’m captivated by the virtual brushstrokes, texture, and lighting that the AI generates. Example 1. Example 2.
[AI Art Weekly] How do you imagine AI (art) will be impacting society in the near future?
Everyone has a unique story to share, and modern tools have made it easier than ever to express oneself creatively. I find it truly inspiring to witness the imaginative worlds others create, which might never have existed otherwise. Looking towards the future, I’m optimistic about the incredible new heights of creativity that AI art will bring us.
Of course, as the volume of art increases, there may be potential downsides, such as falling prices or certain individuals experiencing hardship while others thrive. However, these issues are simply part of progress and the natural ebb and flow of history. As our wants and needs are limitless, there will always be something new and exciting to explore.
While some may lament what is lost, I prefer to focus on the fresh opportunities that will emerge for new jobs and artistic creations. Additionally, many individuals will continue to produce art in more traditional ways simply because they love it, similar to many other vocations.
[AI Art Weekly] Who is your favourite artist?
Well, I have a lengthy list, but I’ll only mention a few. When it comes to illustrators, Simon Bisley captures everything I adore about artworks: intricate details, energetic poses, and compositions with a hint of violence. As for AI, I primarily follow Asian creators like @tohofrog, @8co28, @Muacca. Then there’s a small account that I like because he explores random themes: @kitsunezaka55. These individuals play with prompts in unique ways, resulting in textures and compositions that I never would have thought of. Every time MemoryMod appears in my feed, it’s a lovely surprise (check out the interview in issue #12). On the other hand, for traditional painters, I appreciate William-Adolphe Bouguereau’s work. His meticulous attention to detail is unparalleled for its time.
[AI Art Weekly] Anything else you would like to share?
I would like to express my gratitude for your efforts. Being acknowledged is a priceless blessing, and I’m thankful for it. I also extend my appreciation to all those reading this. If you have a passion for this medium, keep encouraging, learning, and sharing. Let us show support for the art and the individuals who are striving to expand it further.
Creation: Tools & Tutorials
These are some of the most interesting resources I’ve come across this week.
@KaliYuga_ai created a fork of the LoRA-enabled Dreambooth notebook and extended it with BLIP functionality to autocaption your image dataset. I’m gonna use this to train my next fine-tune with LoRA.
Tired of spelling out the same negative prompts over and over again? The EasyNegative embedding has you covered. It was trained on the Counterfeit model, but should also work (more or less) with other models.
@johnowhitaker put together a Google Colab to get better frame-to-frame consistency when using the IP2P method. Example 1. Example 2.
If you’re looking to play around with a new fine-tuned SD model or just want to browse a bit, @HuggingFace created a Diffusers Model Gallery to easily do just that.
And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:
- Sharing it 🙏❤️
- Following me on Twitter: @dreamingtulpa
- Leaving a Review on Product Hunt
- Using one of our affiliate links at https://aiartweekly.com/support
Reply to this email if you have any feedback or ideas for this newsletter.
Thanks for reading and talk to you next week!
– dreamingtulpa