Kokoro is an open-weight TTS model with 82 million parameters.
Generate realistic audio from text
Transcribe audio from microphone, file, or YouTube link
Convert spoken words into text
Transcribe or translate audio and YouTube videos
High-fidelity Text-To-Speech
Turn text into speech with customizable voice, rate, and pitch
Request evaluation of a speech recognition model
Efficient, fast, and natural text to speech with StyleTTS 2!
Generate realistic voices from text
Generate anime character speech from text
Generate speech from text
Fast, efficient, & multilingual text-to-speech
Spanish fine-tune of the original F5 model.
Transcribe voice to text
Convert text to speech with Next-gen Kaldi
Realtime implementation of Whisper large turbo
Text-to-Audio (Sound Effects) Generator
Expressive Text-to-Speech
Generate audio from text or modify voice pitch