Speech Recognition
Free speech recognition AI tools for transcribing audio, enhancing accessibility, and improving communication in creative projects and research.
Audio AI Tools
Audio Captioning
Audio Classification
Audio Editing
Audio Generation
Audio Inpainting
Audio Outpainting / Continuation
Audio Separation
Audio-to-3D
Audio-to-Motion
Audio-to-Text
Controllable Audio Generation
Image-to-Audio
Personalized Audio Generation
Speech Recognition
Text-to-Audio
Text-to-Music
Text-to-SFX
Text-to-Speech
Video-to-Audio
AnCoGen analyzes and generates speech by estimating key attributes such as speaker identity, pitch, and loudness. Using a single unified masked autoencoder model, it can also perform speech denoising, pitch shifting, and voice conversion.
AudioSep uses natural language queries to separate audio events and musical instruments and to enhance speech. It performs well in open-domain audio source separation and is reported to significantly outperform previous models.