AI Art Weekly #46
Hello there, my fellow dreamers, and welcome to issue #46 of AI Art Weekly! 👋
Some of the tech released this week is beyond stunning, and while it will bring a lot of utility, a word of caution: we’ve now fully arrived in the age of deepfakes, and everything you see or hear online can be fake. Keeping your personal data safe is more essential than ever because anybody can use these tools, so please be careful where you upload and share your image, voice and video data online. That said, this tech will not go away; it will only evolve and get better and better, so let’s not be gloomy and have fun with it instead! Here are the highlights of the week:
- HeyGen will soon support 100% AI generated fully voiced and gesturing realistic avatars
- PlayHT2.0 can generate speech with cloned accents and promptable emotions
- AudioLDM 2 adds text-to-speech and image-to-audio generation
- LED can denoise low-light images trained on only 6 pairs of images
- PlankAssembly can convert 2D line drawings into 3D CAD models
- Interview with digital artist Anya Asano
- ControlNet Canny for SDXL
- AutoTrain Dreambooth Colab
- and more
Twitter recently shut down free API access, which puts our weekly cover challenges at risk. By becoming a supporter, you can help me make AI Art Weekly and its community efforts more sustainable by supporting its development & growth! 31/100% reached so far 🙏
Cover Challenge 🎨
Unfortunately, the finalists poll was tampered with this week, which is why I had to disqualify the results and pick the winner myself. For next week’s challenge I’ll either decide on the winner directly or put together a committee of people I trust to decide. I’m sorry for the inconvenience, but I hope you can understand.
For next week’s cover I’m looking for fire, water, earth and wind inspired artworks. The reward is $50. The rulebook can be found here and images can be submitted here. Come join our Discord to talk challenges. I’m looking forward to your submissions 🙏
News & Papers
HeyGen
HeyGen’s founder Joshua Xu tweeted a video of a work-in-progress feature to create 100% AI-generated, realistic and fully voiced videos with body gestures. Apparently it will get rolled out soon, and you can sign up for their waitlist here. If you do so, you’ll receive a demo video of Joshua’s avatar talking about the information you submitted – which could easily be exploited… Their tech apparently needs just two minutes of footage of a person to recreate a realistic digital avatar of them.
PlayHT2.0
PlayHT2.0 got released this week. The new state-of-the-art generative voice AI model can not only generate speech with a more natural flow and intonation, but also supports promptable emotions and can clone accents over to other languages. The crazy part: cloning only requires three seconds of audio, and speech can be generated in real time with a latency of 800ms.
AudioLDM 2
The next iteration of AudioLDM got released this week. Aside from improved quality for text-to-audio and text-to-music output, the new model is also capable of text-to-speech as well as image-to-audio, generating music and audio effects from images only. There is a HuggingFace demo to play around with the model here.
Lighting Every Darkness in Two Pairs: A Calibration-Free Pipeline for RAW Denoising
RIP expensive low-light cameras? It’s amazing how AI is able to solve problems that so far could only be solved with better hardware. In this example, the novel LED model is able to denoise low-light images after being trained on only 6 pairs of images. The results are impressive, but the team is not done yet: they’re currently researching a method that works across a wide variety of scenarios trained on only 2 pairs.
PlankAssembly: Robust 3D Reconstruction from Three Orthographic Views with Learnt Shape Programs
Carpenters and 3D printers rejoice. PlankAssembly is a new method that can convert 2D line drawings from three orthographic views into 3D CAD models. It even lets users scale and move planks to edit generated models. I’ve two left hands when it comes to crafting, but this seems like a great tool for planners.
More papers & gems
- LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation
- PHDiffusion: Painterly Image Harmonization
- Mirror-NeRF: Learning Neural Radiance Fields for Mirrors with Whitted-Style Ray Tracing
- AvatarVerse: High-quality & Stable 3D Avatar Creation from Text and Pose
- 3D Gaussian Splatting for Real-Time Radiance Field Rendering
@A_B_E_L_A_R_T made a hilarious short for the Pika Labs meme contest 😂
@n_reruns created an AI-generated Celebrity Mortal Kombat roster with Gen-2, Midjourney and ElevenLabs.
@_vnderworld put together a process and overview thread dissecting his “Bourgeois Revolution” piece created for the Zeitgeist collection.
Interviews
This week I had the pleasure of interviewing digital artist Anya Asano. Anya’s AI-infused glitch art caught my eye recently, and I wanted to learn more about the process behind that work. I’m happy Anya agreed to share some insights with us.
Tools & Tutorials
These are some of the most interesting resources I’ve come across this week.
The first ControlNet model with canny conditioning for SDXL 1.0 is out. If you’re interested in training ControlNet for SDXL, there is also a helpful guide available.
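For context on what “canny conditioning” means: the ControlNet receives a black-and-white edge map of a reference image, and the generated image then follows those outlines. Here’s a minimal, dependency-free sketch of building such an edge map — note it uses a simple Sobel-gradient approximation rather than a full Canny detector, and the function name and threshold are my own illustration, not part of the released model:

```python
import numpy as np

def sobel_edges(gray, threshold=0.25):
    """Rough edge map from a grayscale image (values in [0, 1]) via
    central-difference gradients -- a stand-in for the Canny detector
    normally used to build ControlNet conditioning images."""
    gx = np.zeros_like(gray, dtype=float)
    gy = np.zeros_like(gray, dtype=float)
    gx[:, 1:-1] = gray[:, 2:] - gray[:, :-2]  # horizontal gradient
    gy[1:-1, :] = gray[2:, :] - gray[:-2, :]  # vertical gradient
    mag = np.hypot(gx, gy)                    # gradient magnitude
    peak = mag.max()
    if peak > 0:
        mag /= peak                           # normalize to [0, 1]
    # Binarize: strong gradients become white edge pixels
    return (mag > threshold).astype(np.uint8) * 255
```

In practice you’d use a real Canny implementation (e.g. OpenCV’s `cv2.Canny`) and pass the resulting edge image as the conditioning input to the SDXL ControlNet pipeline alongside your text prompt.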
Speaking of deepfakes: DeepFaceLab is an open-source software for altering faces in existing images and video footage. It can replace faces and heads, de-age faces and manipulate lip-syncing.
HuggingFace released a Google Colab notebook for Dreambooth training that works with SDXL, SD2/2.1 as well as SD1.5.
Jukebox Diffusion created by @jmoso13 is a hierarchical latent diffusion model that is able to generate music. You can learn more about it in this medium blog post.
@bram_wallace trained a new text-to-image model from scratch for a cost of $75K, with quality comparable to SD1.5/2.1. He shared his findings in this article, and it’s an interesting read.
@fffiloni created yet another HuggingFace demo. This one creates a story from a single image: it uses CLIP Interrogator to generate a caption of the image and then feeds that caption into Llama 2 to generate a story.
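The two-stage structure of that demo — caption first, then prompt an LLM with the caption — is easy to picture in code. This is just a sketch of the pipeline shape; both helper functions are hypothetical stand-ins, not the demo’s actual CLIP Interrogator or Llama 2 calls:

```python
def caption_image(image_path: str) -> str:
    """Stage 1: describe the image. A real version would run
    CLIP Interrogator here; this stub returns a fixed caption."""
    return "a lone lighthouse on a stormy coast, oil painting"

def generate_story(caption: str) -> str:
    """Stage 2: turn the caption into a story. A real version would
    send this prompt to Llama 2; this stub just echoes the prompt."""
    prompt = f"Write a short story inspired by this scene: {caption}"
    return f"[story generated from prompt: {prompt!r}]"

def image_to_story(image_path: str) -> str:
    # Chain the two stages: the caption becomes the LLM's prompt.
    return generate_story(caption_image(image_path))
```

The nice property of this design is that the two stages are fully decoupled: you can swap in any captioner or any LLM without touching the other half.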
And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:
- Sharing it 🙏❤️
- Following me on Twitter: @dreamingtulpa
- Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday 😅)
Reply to this email if you have any feedback or ideas for this newsletter.
Thanks for reading and talk to you next week!
– dreamingtulpa