Speech Recognition

Free speech recognition AI tools for transcribing audio, enhancing accessibility, and improving communication in creative projects and research.

AnCoGen

AnCoGen analyzes speech by estimating key attributes such as speaker identity, pitch, and loudness, and generates speech from those attributes. Because a single masked autoencoder model handles both directions, it can also perform speech denoising, pitch shifting, and voice conversion.

11.03.25 · Project Page · Code · Speech Recognition · Text-to-Speech

Separate Anything You Describe

AudioSep uses natural language queries to separate audio events and musical instruments and to enhance speech. It performs well on open-domain audio source separation and is reported to substantially outperform earlier separation models.

09.08.23 · Project Page · Code · Demo · Audio Classification · Speech Recognition