Audio AI Tools
Free audio AI tools for sound design, music composition, and voice synthesis, helping creatives produce unique audio experiences effortlessly.
WavJourney is a system that uses large language models to generate storyline-driven audio content combining speech, music, and sound effects, guided by text instructions. The demo results, while not perfect, sound great.
CoMoSpeech can synthesize speech and singing voices in one step with high audio quality. It runs over 150 times faster than real-time on a single NVIDIA A100 GPU, making it practical for text-to-speech and singing applications.
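To put the speed claim in concrete terms, here is a minimal sketch of the arithmetic. The 150x figure is the one quoted above. The function name and the 10-second clip length are illustrative assumptions, not part of CoMoSpeech.

```python
def synthesis_time_seconds(audio_seconds: float, speedup: float) -> float:
    """Estimate wall-clock synthesis time for a clip.

    speedup is how many times faster than real time the model runs
    (150 is the figure quoted for a single NVIDIA A100).
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


# Hypothetical example: a 10-second clip at 150x real time.
clip_seconds = 10.0
elapsed = synthesis_time_seconds(clip_seconds, 150.0)
print(f"{clip_seconds:.0f} s of audio in about {elapsed * 1000:.1f} ms")
```

At that rate, a 10-second clip would take roughly 67 milliseconds to synthesize, assuming the speedup holds for clips of that length.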
Msanii can create high-quality music tracks up to 190 seconds long at a sample rate of 44.1 kHz. It runs a diffusion model on mel spectrograms and uses a neural vocoder to turn them into audio, which also allows audio-to-audio style transfer and smooth transitions (interpolation) between audio samples.
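For a sense of scale, the sketch below works out how many audio samples a track of that length contains. The duration and sample rate come from the description above. The helper name and the channel count parameter are assumptions for illustration only.

```python
def total_samples(duration_seconds: int, sample_rate_hz: int, channels: int = 1) -> int:
    """Number of audio samples in a clip: duration x sample rate x channels."""
    return duration_seconds * sample_rate_hz * channels


# 190 seconds at 44.1 kHz, counted per channel.
per_channel = total_samples(190, 44_100)
print(f"{per_channel:,} samples per channel")  # 8,379,000 samples per channel
```

That is over eight million samples per channel, which is one reason to generate a compact mel spectrogram first rather than model the raw waveform directly.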
I Hear Your True Colors: Image Guided Audio Generation generates audio that matches an input image using a two-stage Transformer model. It produces high-quality sound and introduces the ImageHear dataset for evaluating future image-to-audio models.
AudioLM generates high-quality audio by treating audio generation as a language modeling task. It produces coherent continuations of speech and piano music while keeping the speaker’s voice and style consistent, even for speakers it has not heard before.