Hello there, my fellow dreamers, and welcome to issue #40 of AI Art Weekly! 👋
I knew it was coming, but I didn’t expect it to be this severe. Twitter removed reading access on their free API tier, which I’ve been using to download and organize submissions for the weekly cover challenges. I quite like the format the challenges have on Twitter, which is why I’m looking into ways to crowdsource the monthly API expenses ($100/m) so I can keep them running. For starters I’ve set up some membership tiers on ko-fi in case someone has some spare change to help a fellow out 🧡
With this out of the way, let’s take a look at the highlights of this week:
- ZeroScope v2 (XL) Text-to-Video models
- Unity AI
- Single Image to 3D
- Filtered-Guided Diffusion
- Interview with AIIA DAO founder MANΞKI NΞKO
- Official DragGAN demo released
- and more
Cover Challenge 🎨
News & Papers
Right when I started playing with Gen-1 and Gen-2 by RunwayML, a new text-to-video model called ZeroScope XL was released. The model is based on ModelScope (issue #26) and is able to generate watermark-free videos at a resolution of 1024x576 pixels. The results can be quite impressive, but also cursed. The ZeroScope v2 workflow encourages exploring with the smaller 576x320 model first and then committing to an upscale with the XL model. To run it yourself, here are a few helpful links:
- Weights: zeroscope_v2_576w, zeroscope_v2_XL
- The weights above are compatible with the Automatic1111 sd-webui-text2video extension
- Google Colab notebook by AILostMedia with a tutorial
- I’ve modified the notebook above to run on GPUs from Runpod, which is overall cheaper if you want faster generations. I’ll put together a guide next week on how I use this.
- There are also two HuggingFace demos. One for regular text-to-video and one for image-caption to video.
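If you’d rather script the two-stage workflow than use the webui or Colab links above, here is a hedged sketch using the Hugging Face diffusers library. The model IDs follow the cerspense uploads on the Hub; the function names, parameter values, and overall structure are my own assumptions, and a CUDA GPU plus the diffusers/torch/Pillow packages are assumed:

```python
# Sketch of the ZeroScope v2 two-stage workflow: draft at 576x320 with
# zeroscope_v2_576w, then re-diffuse at 1024x576 with zeroscope_v2_XL.
# Heavy imports are kept inside the functions so the sketch is cheap to load.

def generate_draft(prompt, num_frames=24):
    """Stage 1: explore cheaply with the small 576x320 model."""
    import torch
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained(
        "cerspense/zeroscope_v2_576w", torch_dtype=torch.float16
    )
    pipe.enable_model_cpu_offload()  # helps fit consumer VRAM
    return pipe(prompt, num_frames=num_frames, width=576, height=320).frames

def upscale_draft(prompt, frames):
    """Stage 2: commit to an upscale at 1024x576 with the XL model."""
    import torch
    from diffusers import DiffusionPipeline
    from PIL import Image

    # Resize the draft frames to the XL resolution first. Note: the exact
    # format of .frames (uint8 arrays vs. floats) varies between diffusers
    # versions, so this conversion may need adjusting.
    frames = [Image.fromarray(f).resize((1024, 576)) for f in frames]

    pipe = DiffusionPipeline.from_pretrained(
        "cerspense/zeroscope_v2_XL", torch_dtype=torch.float16
    )
    pipe.enable_model_cpu_offload()
    # strength < 1 keeps the draft's motion/composition and only adds detail
    return pipe(prompt, video=frames, strength=0.6).frames

if __name__ == "__main__":
    from diffusers.utils import export_to_video

    prompt = "a golden retriever surfing a wave at sunset"
    export_to_video(upscale_draft(prompt, generate_draft(prompt)), "zeroscope_xl.mp4")
```

The two-pass split is the point: iterating on prompts with the small model is much faster, and only the keeper gets the expensive XL pass.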
Unity is working on a set of AI products that will help game developers ship games faster with generative AI. One of those projects is called Orb, a fully AI-driven character. If you have been following the newsletter, this isn’t super impressive, as anybody could already do this by combining Midjourney and SadTalker. But as always, usability is key here. Not a lot of people have the ability to pull this off yet, but once this gets integrated into the Unity Engine, things will kick off. If you’re interested in getting access to the tools, you can apply for beta access here.
New Midjourney feature:
Midjourney released a new parameter called --weird. The important parts from the release notes are:
- This parameter makes your images look more weird / edgy
- It goes from 0 to 3000 (but idk if you need it that weird)
- We recommend starting with smaller values such as 500 and then going up/down from there
- If you want it to look weird AND pretty, you should try using --stylize along with your --weird value
- If you use --stylize with --weird, we recommend using an equal value for both parameters
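Putting those notes together, a (hypothetical) prompt combining both parameters at equal values could look like this:

```
/imagine prompt: a porcelain astronaut in a tide pool --weird 500 --stylize 500
```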
Image to 3D
Image-to-3D is getting good. One-2-3-45 is a method that is able to generate a full 360-degree 3D textured mesh in 45 seconds. Then there is also CSM, which is building its own foundation models that can turn any image into a 3D model. The results look very promising. You can try it out yourself by joining their Discord.
Filtered-Guided Diffusion: Fast Filter Guidance for Black-Box Diffusion Models
Filtered-Guided Diffusion shows that image-to-image translation and editing doesn’t necessarily require additional training. FGD simply applies a filter to the input of each diffusion step based on the output of the previous step in an adaptive manner, which makes this approach easy to implement.
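As a loose illustration of the general idea (a toy sketch, not the paper’s actual algorithm), here is a NumPy example: at each “step”, the low-frequency band of the current sample is pulled toward the low frequencies of a guide image while its high-frequency detail is left alone. The box-blur filter, the decaying guidance schedule, and all names here are my own assumptions:

```python
import numpy as np

def box_blur(img, k=5):
    """Simple separable box blur, standing in for a generic low-pass filter."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def filter_guided_step(x, guide, strength):
    """Pull x's low frequencies toward the guide's, keeping x's detail.

    x        -- current (toy) diffusion estimate, 2D array
    guide    -- reference image we want structure from
    strength -- 0..1, how hard to steer the low frequencies this step
    """
    return x + strength * (box_blur(guide) - box_blur(x))

rng = np.random.default_rng(0)
guide = np.linspace(0, 1, 64 * 64).reshape(64, 64)  # smooth gradient "image"
x = rng.random((64, 64))                            # stand-in noisy sample

err0 = np.abs(box_blur(x) - box_blur(guide)).mean()  # initial low-freq mismatch

# toy "sampling loop": guidance strength decays over steps (the adaptive part)
for strength in np.linspace(1.0, 0.1, 10):
    x = filter_guided_step(x, guide, strength)

err = np.abs(box_blur(x) - box_blur(guide)).mean()
print(err0, err)  # the low-frequency mismatch shrinks over the steps
```

The appeal of the real method is the same as in this toy: the guidance only touches the filter’s input at each step, so no retraining of the diffusion model is needed.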
DiffComplete: Diffusion-based Generative 3D Shape Completion
DiffComplete is an interesting method that is able to complete 3D objects from incomplete shapes. The first thing that came to my mind when seeing this was how cool it would be to have an “autocomplete” feature in an editor like Cinema 4D or Unity.
DreamDiffusion: Generating High-Quality Images from Brain EEG Signals
Research in the Thought-to-Image department is making progress. So far we’ve seen MinD-Vis and Mind-Video, which are able to transform fMRI signals into images. But this approach isn’t really feasible for us normies, as the equipment is expensive, non-portable, and can only be operated by professionals. DreamDiffusion changes that. It uses EEG signals instead, which are non-invasive and can be recorded with “low-cost” portable commercial products. Dream-To-Video wen?
- FF3D: Free-style and Fast 3D Portrait Synthesis
- DetectorFreeSfM: Detector-Free Structure from Motion
- MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion
With the proposal of DocT, I recently joined the AIIA DAO, a collective of artists collaborating in Web3 to support the AI Infused Art movement. This movement aims to promote acceptance of AI as a fine art creation tool. This week, I had the pleasure of interviewing its founder, MANΞKI NΞKO.
Tools & Tutorials
These are some of the most interesting resources I’ve come across this week.
And that, my fellow dreamers, concludes yet another AI Art Weekly issue. Please consider supporting this newsletter by:
- Sharing it 🙏❤️
- Following me on Twitter: @dreamingtulpa
- Buying me a coffee (I could seriously use it, putting these issues together takes me 8-12 hours every Friday 😅)
Reply to this email if you have any feedback or ideas for this newsletter.
Thanks for reading and talk to you next week!