Hello there, my fellow dreamers, and welcome to issue #43 of AI Art Weekly! 👋
Another crazy week is behind us. Meta open-sourced Llama 2 and Apple is apparently working on its own LLM, dubbed Apple GPT. Yours truly is moving flats tomorrow, so apologies if this issue contains a few more typos than usual, as I’m writing it while sitting between chairs. So let’s jump in. This week’s highlights are:
- SHOW-1: An AI system that can generate TV show episodes for existing IPs (fake or real?)
- TokenFlow: A new (almost) flickerless video-to-video edit method
- AnyDoor: Zero-shot object-level image customization
- FABRIC: Personalizing diffusion models with iterative feedback
- NIFTY: Human motion synthesis with objects in mind
- INVE: A new real-time video editing solution
- Brain2Music 🎶🧠🎵
- Interview with AI artist St Laurent Jr
- AI Johnny Cash singing ‘Barbie Girl’
- and more
Cover Challenge 🎨
News & Papers
To Infinity and Beyond: SHOW-1 and Showrunner Agents in Multi-Agent Simulations
AI generated shows are coming, probably sooner than we think! Although opinions are split on whether the following one is real or just a genius marketing stunt by Matt Stone and Trey Parker. SHOW-1 is presented as a new system that can be trained to generate TV show episodes for existing IPs. The people behind SHOW-1, Fable Studio, demonstrate it with a system trained on the South Park universe and its characters. I’ve seen a lot of AI generated content in the last 12 months, and if it’s real, the episode on the paper’s page is hands-down crazy considering it’s the first of its kind. What do you think: is SHOW-1 real or a hoax?
TokenFlow: Consistent Diffusion Features for Consistent Video Editing
TokenFlow is a new video-to-video method for temporally coherent, text-driven video editing. We’ve seen a lot of them, but this one looks extremely good, with almost no flickering, and requires no fine-tuning whatsoever.
AnyDoor: Zero-shot Object-level Image Customization
Inpainting is cool and all, but placing an object into a scene based on an already existing image is trickier. AnyDoor aims to solve that. Not only can you place objects into a scene in a harmonious way, it also enables moving and swapping objects within a scene. Want to know how that fancy new sweater looks on you? Just take a picture and do a virtual try-on by replacing your worn-out T-shirt with it.
FABRIC 🎨: Personalizing Diffusion Models with Iterative Feedback
FABRIC is an interesting approach to guiding image generation, not with traditional fine-tuning as we know it, but by conditioning the diffusion process iteratively on human feedback. This means that with the same model and prompt, each user’s preferences can be picked up during image diffusion, yielding totally different results. There is a demo on HuggingFace where you can try it out.
NIFTY: Neural Object Interaction Fields for Guided Human Motion Synthesis
NIFTY looks like an interesting approach to human motion synthesis with object interactions. It uses Neural Object Interaction Fields to guide motion diffusion. I can’t wait to see stuff like this get implemented into game engines. Especially in its current quirky state: imagine some random NPC sprinting towards a chair and leisurely sitting down like in the examples below 😂
INVE: Interactive Neural Video Editing
INVE is a real-time video editing solution that assists the editing process by consistently propagating sparse frame edits to the entire video clip. INVE’s editing pipeline supports multiple types of edits: users can sketch scribbles, make local adjustments (brightness, saturation, hue) to a specific region in the scene, edit textures, and import external graphics that track and deform with the moving object. Looks like so much more fun compared to current solutions.
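The core trick — draw once, propagate everywhere — is easy to illustrate. This toy sketch (not INVE’s actual pipeline, which learns the motion; here the per-frame motion is simply known) warps a scribble drawn on frame 0 along the object’s movement so it stays attached in every frame:

```python
import numpy as np

# Toy video: a bright square "object" moving one pixel right per frame.
H, W, T = 16, 16, 5
frames = np.zeros((T, H, W))
for t in range(T):
    frames[t, 4:8, 2 + t:6 + t] = 1.0

# A sparse edit drawn on frame 0 only: a two-pixel scribble on the object.
edit = np.zeros((H, W))
edit[5, 3:5] = 1.0

# Propagate: warp the edit by the (here, known) per-frame motion.
# A real system would estimate this motion, e.g. with optical flow or
# a learned neural atlas, instead of assuming it.
edits = [np.roll(edit, shift=t, axis=1) for t in range(T)]

# Verify the edit stays glued to the object in every frame.
attached = all((frames[t] * edits[t]).sum() == edits[t].sum() for t in range(T))
```

Running this, `attached` comes out `True`: the scribble tracks the square through all five frames even though the user only touched the first one.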
Brain2Music: Reconstructing Music from Human Brain Activity
We’ve seen Brain-to-Image, Brain-to-Video, Brain-to-Code, and now we have Brain2Music. The new approach uses MusicLM to reconstruct music from fMRI data recorded in response to musical stimuli. Super interesting. I’m not a musician by any stretch, but when an approach like this becomes available for EEG signals, I’m definitely gonna give it a spin to create some weird experimental braindance music.
More papers & gems
- DreamTeacher: Pretraining Image Backbones with Deep Generative Models
- NU-MCC: Multiview Compressive Coding with Neighborhood Decoder and Repulsive UDF
- Diff-Harmonization: Zero-Shot Image Harmonization with Generative Model Prior
- BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion
Tools & Tutorials
These are some of the most interesting resources I’ve come across this week.
And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:
- Sharing it 🙏❤️
- Following me on Twitter: @dreamingtulpa
- Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday 😅)
Reply to this email if you have any feedback or ideas for this newsletter.
Thanks for reading and talk to you next week!