AI Art Weekly #30
Hello there my fellow dreamers and welcome to issue #30 of AI Art Weekly! 👋
It’s crazy to think that this is already the 30th time I’m writing these lines. I’ve got another week chock-full of news and gems, so let’s dive right in. The highlights are:
- Promising Text-to-Video research from NVIDIA
- Adobe is exploring generative AI for video and audio
- AMT is a new interpolation method
- Interview with artist lilyillo
- New Automatic1111 fork that fixes a lot of issues
Cover Challenge 🎨
For the next cover I’m looking for pieces inspired by your favourite music, so the next theme is song lyrics. Please also share the song that inspired your submission; I’ll compile a Spotify playlist from all entries. The reward is another $50. The rulebook can be found here and images can be submitted here. Come join our Discord to receive feedback and talk future challenges. I’m looking forward to all of your artwork 🙏
Reflection: News & Gems
NVIDIA Text-to-Video Research
NVIDIA published a paper this week called Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models. In short, NVIDIA found a way to train text-to-video models, or Video Latent Diffusion Models (Video LDMs), by fine-tuning an existing pre-trained LDM such as Stable Diffusion. The generated videos have a resolution of 1280×2048 pixels, consist of 113 frames and are rendered at 24 fps, resulting in 4.7-second clips. Emad mentioned that the people who worked on this joined Stability and are in the process of training open-source models. Great news.
Adobe is exploring generative AI for video and audio
In a blog post this week, Adobe showcased new ideas they’re exploring to bring generative AI into their video, audio, animation, and motion graphics design apps, which could supercharge the discovery and ideation processes and cut post-production time from days to minutes. These are some of the concepts they’re exploring:
- Text to color enhancements that enable you to change color schemes, time of day, or even the seasons in already-recorded videos
- Music and sound effects that let you generate royalty-free custom sounds and music to reflect a certain feeling or scene
- Generating fonts, text effects, graphics, and logos
- Script and B-roll capabilities which aim to dramatically accelerate pre-production, production and post-production workflows
When implemented, this could significantly improve the creative workflow for video editing. This might be the final straw to convince me to sign up for an Adobe subscription.
AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation
There is a new interpolation method called AMT. Similar to RIFE and FILM, AMT creates interpolation frames between two input images. I’m not sure yet how AMT compares to the two older methods, but apparently it’s a lighter, faster, and more accurate algorithm for frame interpolation, aiming to be a practical solution for generating video from given frames.
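To see what these methods improve on, here’s a minimal sketch of the naive alternative: a plain per-pixel cross-fade between two frames. This is my own illustration, not AMT’s algorithm — learned interpolators like AMT, RIFE, and FILM predict motion between the frames, which avoids the ghosting this simple blend produces on anything that moves.

```python
def blend_frames(frame_a, frame_b, t):
    """Naively interpolate between two frames at time t in [0, 1].

    Frames are nested lists of grayscale pixel intensities, kept
    dependency-free for illustration. This cross-fade ignores motion,
    so a moving object shows up twice at half brightness instead of
    at an in-between position.
    """
    return [
        [(1 - t) * a + t * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]

# Two 2x2 "frames": a bright pixel moves from left to right.
f0 = [[255, 0], [0, 0]]
f1 = [[0, 255], [0, 0]]
mid = blend_frames(f0, f1, 0.5)  # both positions lit at half brightness
```

A motion-aware interpolator would instead place a single full-brightness pixel halfway between the two positions — that gap is exactly what the learned multi-field transforms in AMT are for.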
How about editing images by masking them and referencing another image? We’ve seen methods similar to Paint-by-Sketch in earlier issues, but this one you can actually use today. The framework works by fine-tuning a pre-trained diffusion model to complete missing regions using the reference image while maintaining sketch guidance.
DiFaReli: Diffusion Face Relighting
DiFaReli introduces a method for face relighting that can handle non-diffuse effects, such as global illumination or cast shadows, and produces realistic results. The method only requires 2D images to train and can be used to relight images and videos with different lighting conditions. Pretty cool.
MasaCtrl and Delta Denoising Score
InstructPix2Pix and Pix2Pix-Zero are getting company. DDS is a novel method by Google that can guide image editing towards a desired text prompt and can be used to train a zero-shot image translation model. Examples look quite promising. Then we also had MasaCtrl by TencentARC. Compared to DDS, MasaCtrl doesn’t require any fine-tuning and can be integrated into existing controllable diffusion models, like T2I-Adapter.
Text2Performer: Text-Driven Human Video Generation
We’re getting closer to real human video generation. Last week, we saw DreamPose, a model that was able to generate a video from an input image and a pose sequence. This week, there is Text2Performer. The model can create high-quality videos of human figures performing complex motions from text descriptions alone.
More papers and gems
- Avatars Grow Legs: Generating Smooth Human Motion from Sparse Tracking Inputs with Diffusion Model
- CAMM: Building Category-Agnostic and Animatable 3D Models from Monocular Videos
- SelfUnroll: Self-Supervised Scene Dynamic Recovery from Rolling Shutter Images and Events
- StableLM: Stability Foundation open-sourced their first language models
- CondFoleyGen: Conditional Generation of Audio from Video via Foley Analogies
- NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models
The demo I’m most hyped about this week comes from @frantzfries. He built a conversational NPC powered by ChatGPT, Whisper and ElevenLabs in a VR environment that I would absolutely love to chat with. Brother Geppetto is 🔥
@jessicard hooked up ChatGPT to a Furby and quote: “I think this may be the start of something bad for humanity” – I think she may be right.
@bobarke built WeatherPainting.com, an impressionist weather report that uses @openAI’s Dall-E API to create generative art from real-time weather data.
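The interesting part of a project like WeatherPainting.com is the mapping from live weather data to an image prompt. Here’s a hypothetical sketch of what that step might look like — the template, palette rules, and style keywords are my own assumptions, not the site’s actual implementation, and the resulting string would then be sent to an image API such as DALL-E.

```python
def weather_prompt(conditions, temp_c, city):
    """Turn real-time weather data into an image-generation prompt.

    A hypothetical mapping from weather fields to prompt text; a real
    service would fetch `conditions` and `temp_c` from a weather API
    and pass the returned string to an image-generation endpoint.
    """
    if temp_c <= 0:
        palette = "cold blues and whites"
    elif temp_c < 18:
        palette = "muted greys and soft greens"
    else:
        palette = "warm yellows and oranges"
    return (
        f"An impressionist oil painting of {city} on a {conditions} day, "
        f"{palette}, visible brushstrokes, in the style of Monet"
    )

prompt = weather_prompt("rainy", 9, "London")
```

Keeping the weather-to-words mapping in plain code like this makes the generative part deterministic to test, even though the image model itself isn’t.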
@TomLikesRobots is killing it with his ModelScope text-to-video tests. He found that simple prompts with a strong style can produce consistently interesting animations that overpower the Shutterstock logo.
Imagination: Interview & Inspiration
This week, I had the pleasure of interviewing Australian artist @lilyillo. Based in Canberra, she skillfully explores themes like identity and womanhood through a mix of traditional and digital AI mediums. With residencies in Paris and Scotland under her belt, lilyillo’s work has been showcased extensively in Australia and internationally. I’m stoked she took the time to answer my questions. Let’s jump in.
What’s your background and how did you get into AI art?
I was always creating as a young child, drawing, painting and making crafts. I grew up learning how to paint with oils and watercolour paint, how to draw, making clay forms, learning to sew & knit, and began taking life drawing classes at age 13 and attended them for the rest of my teenage years.
I went to university to study a Bachelor of Art Theory and a Masters in Arts Administration & Curatorial Studies, and also completed my MFA majoring in drawing. I approached the act of drawing in the broad sense of mark-making. My work prior to AI has been a mixture of mediums encompassing drawing, installation, watercolour and digital portraiture.
I began to explore AI tools at the beginning of 2022, when we saw more accessible tools like Midjourney, Dalle2 and Stable Diffusion/DreamStudio offering beta testing, and I just could not look away and began in research mode, just playing to see what was possible. I was really interested in whether these tools could be used to find a personal style, because at the time, AI outputs were so similar from tools like Midjourney and Artbreeder, and I wondered if I could train the models or become so good at prompting them that I might be able to push more towards my own drawing style and aesthetic.
Do you have a specific project you’re currently working on? What is it?
I’m working on a new collection for Foundation called Woodside, which centres around ideas of craftsmanship, making and ultimately the destruction of one’s creations.
The series explores personal memories of my childhood home, which was built on a street called ‘Woodside Avenue’. I was only a young child there, and my dad, who trained as a carpenter and architect, was renovating our wooden house. He had built out new rooms, a kitchen and a living space there. Yet, he would wake in the night to hear termites eating at the wooden home, and eventually we had to knock the house down and move on.
My dad now has dementia; he just turned 80 last week, and as his memory goes, I imagine tiny termites in his mind, eating away at his memories as fast as I try to gather them. So, I find myself talking with him about his past and his life. It has been so wonderful to connect like this, and to fuel a creative output from it has been a wonderful thing to do. I gather new ideas for prompts as he tells me about different woods, tools he used, different carpentry techniques and stories from his life. There is some therapy in this process.
Personally, the series is intrinsically linked to the act of creating within the context of these new AI tools. I am always aware of the concerns from traditional artists while also embracing the new technology and what it has to offer us: the ability to iterate quickly and create with these tools, and also the destruction and decay. I am drawing metaphors to the role of the maker when it comes to creating with AI, and to some of the concerns around AI as a tool for creation and/or destruction.
What drives you to create?
I am always creating in an attempt to find some beauty, and I create as a means to process thoughts and feelings, bit by bit claiming ground, towards the goal of slowly understanding or embodying knowledge, a sense of self or a closeness to a subject.
What does your workflow look like?
Inspiration comes in seasons, as it should, and shouldn’t be forced. It usually centres around something that I have been wrestling with for a while, consciously or unconsciously. The personal is special and very important to me. I am constantly thinking about family, relationships, life, doing life – this is the way of the introvert. And what is beautiful is that I feel more and more when working with AI that initial inspiration can be built on, and the speed of iterations can help you to reach deeper much faster. When searching our own personal history, our relationships and our identity, and when we bring these things to the tools, there is so much possibility for growth and understanding.
Technically I work in various ways, sometimes using initial image prompting or drawing sketches for posing purposes, and then working across different AI models including Stable Diffusion, Dalle2 and Midjourney. I usually jump back and forth between AI models for inpainting & outpainting, as well as using Photoshop for touching up and Topaz for upscaling.
What is your favourite prompt when creating art?
I am not sure I have a favourite prompt to be honest. I am very flexible as to how I prompt and I like to experiment often, so I think my favourite thing might be seeing how the prompt will be interpreted and then refining based on feedback.
My happiest approach, which I find leads to the most interesting interpretations, is to write my prompts in a haphazard way, using lots of commas, with reference to textures, patterns, and different materials like “cherrywood wood grain” or “sassafras marquetry”. In the work ‘Holes in my home’ I referenced the “spots, dots, termite holes, tracks that creep out and around him”, and I really like to describe what the subject of a portrait is feeling: “his head hung in the delicate sorrow of loss and remorse”.
How do you imagine AI (art) will be impacting society in the near future?
There is so much that is possible and so much possibly unimaginable, that sometimes I just think I want to be here now, otherwise I might just get overwhelmed. I am someone who looks backwards in order to move forward, not often looking forward too far.
Who is your favourite artist?
Okay, so there are so many artists in the AI space who inspire me and whom I would consider my contemporaries. Claire Silver, Anna Condo, Holly Herndon, Graphica, Moey P Wellington, Pale Kirill, Viola Rama, Yuma Sogo, 0009, Georgina Hooper; there are so, so many who inspire me daily in our beautiful global artist studio.
But outside of this immediate space, I love the work of contemporary artists including Australian painters Jordy Kerwick, Noel McKenna and Mitch Cairns; the paintings of Pakistani painter Salman Toor; illustrator Clover Robin, who makes beautiful paper-cut collages; Aris Moore, who has a beautiful naive style of drawing charming characters; German artist Kiriakos Tompolidis, with such beautiful portraits in a flattened, naïve style with beautiful patterns; painter Ben Crase, a California artist working with imagery of the American West; and Swedish painter Karin Mamma Andersson.
Anything else you would like to share?
There is something so special about the global artist studio in which we are creating and growing within everyday here. I appreciate you taking the time to get to know my practice some more. Thanks!
Creation: Tools & Tutorials
These are some of the most interesting resources I’ve come across this week.
@vmandic00 is working on a fork of the popular Automatic1111 repo which is very actively maintained, fixes a ton of open issues and adds new features. At the time of writing, the fork is 443 commits ahead of the original master branch. Might be worth checking out.
Beta testing for Automatic1111 ControlNet v1.1 has started and apparently it works quite well. But if something broke for you and you want to test the v1.1 models without installing anything, you can use @camenduru’s colab notebook.
TAS is an interactive demo based on Segment-Anything for style transfer which lets you apply different styles to different content regions.
Facebook open-sourced the code for AnimatedDrawings, a library that lets you turn children’s drawings into animated characters.
After getting inspired by @NathanLands’ meme game, I was thinking I could maybe turn some of my AI art into memes. So I stumbled upon @AndreasRef’s MemeCam, which uses BLIP image recognition and GPT-3.5 to generate captions. The only thing it’s missing, imo, is an option for a smaller font size or the option to just generate the text.
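The two-stage pipeline behind MemeCam — an image model describes the photo, a language model turns the description into a caption — is easy to sketch. This is a hedged illustration of that wiring, not MemeCam’s actual code: the function names, prompt wording, and stubs are all my own, standing in for real BLIP and GPT-3.5 calls.

```python
def meme_caption(describe, caption, image_path):
    """Two-stage captioning pipeline in the spirit of MemeCam.

    `describe` stands in for an image-recognition model (BLIP in
    MemeCam's case) and `caption` for a language model (GPT-3.5).
    Both are passed in as plain functions so the wiring can be shown
    and tested without API keys or model weights.
    """
    description = describe(image_path)
    return caption(f"Write a short, funny meme caption for: {description}")

# Stubs so the pipeline runs without any model:
fake_blip = lambda path: "a cat sitting on a laptop keyboard"
fake_gpt = lambda prompt: prompt.split(": ", 1)[1].upper() + " - WORKING HARD"

text = meme_caption(fake_blip, fake_gpt, "cat.jpg")
```

Keeping the models injectable like this also makes it trivial to add a text-only mode — exactly the “just generate the text” option I wish MemeCam had.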
And that my fellow dreamers, concludes yet another AI Art weekly issue. Please consider supporting this newsletter by:
- Sharing it 🙏❤️
- Following me on Twitter: @dreamingtulpa
- Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday 😅)
Reply to this email if you have any feedback or ideas for this newsletter.
Thanks for reading and talk to you next week!