Kokoro is an open-weight TTS model with 82 million parameters.
Transcribe audio or YouTube videos into text
Simple Space for the Kokoro Model
Convert speech to text from audio files
F5-TTS & E2-TTS: Zero-Shot Voice Cloning (Unofficial Demo)
Transcribe YouTube videos to text
Enhance your audio quality by removing noise
Spanish finetune for the original F5 model.
High-fidelity Text-To-Speech
Transcribe Persian audio files into text
Generate audio from text input
Convert spoken words to text
WebGPU text-to-Speech powered by OuteTTS and Transformers.js
Kokoro TTS is a text-to-speech (TTS) tool that generates natural-sounding audio from text in multiple voices. It is built on Kokoro, an open-weight model with 82 million parameters. Version 1.0 of Kokoro TTS adds features and improvements over earlier releases.
• Multiple Voices: Choose from a variety of voices to customize the output.
• SSML Support: Fine-tune the speech output using Speech Synthesis Markup Language.
• High-Quality Audio: Generate clear and natural-sounding audio files.
• Customization: Adjust settings like pitch, speed, and tone to suit your needs.
• Integration: Easily integrate with other applications for seamless workflows.
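To illustrate the SSML and customization points in the list above, here is a minimal sketch that builds an SSML string with a `<prosody>` element for rate and pitch. The helper name `build_ssml` is hypothetical, and which SSML elements Kokoro TTS actually accepts is an assumption to verify against its own documentation.

```python
import xml.etree.ElementTree as ET


def build_ssml(text: str, rate: str = "medium", pitch: str = "medium") -> str:
    """Wrap text in a minimal SSML fragment with a <prosody> element.

    build_ssml is a hypothetical helper; whether a given TTS front end
    honors these attributes is an assumption, not a documented guarantee.
    """
    speak = ET.Element("speak", {"version": "1.1"})
    prosody = ET.SubElement(speak, "prosody", {"rate": rate, "pitch": pitch})
    # ElementTree escapes &, < and > when the tree is serialized.
    prosody.text = text
    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml("Hello & welcome", rate="slow", pitch="high")
print(ssml)
```

Building the markup with an XML library rather than string concatenation keeps special characters in the input text from breaking the document.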
What formats does Kokoro TTS support?
Kokoro TTS supports WAV, MP3, and other common audio formats for output.
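If you receive raw audio samples rather than a finished file, saving them as WAV takes only the Python standard library. The sketch below assumes mono float samples in the range -1.0 to 1.0 and a 24 kHz sample rate; both are assumptions to check against what your setup actually returns. The `write_wav` helper is hypothetical, and a sine tone stands in for synthesized speech.

```python
import math
import struct
import wave

SAMPLE_RATE = 24_000  # assumed output rate; verify for your model


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, s))  # guard against out-of-range values
        frames += struct.pack("<h", int(clipped * 32767))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))


# Stand-in for synthesized speech: half a second of a 440 Hz tone.
tone = [
    0.3 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
    for n in range(SAMPLE_RATE // 2)
]

if __name__ == "__main__":
    write_wav("output.wav", tone)
```

MP3 and other compressed formats need an external encoder, so converting from WAV afterward is usually the simpler route.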
Can I use Kokoro TTS for commercial purposes?
Kokoro TTS can be used for both personal and commercial projects, subject to the terms of its license. Check the license published with the model before any commercial use.
How many voices are available in Kokoro TTS?
The exact number of voices varies by version. Kokoro TTS offers voices in several languages and tones.