Speech Corpus Creation Tool
Dhravani is a speech corpus creation tool that helps users build high-quality speech datasets by recording voices and transcribing the recordings. It simplifies corpus building, making it accessible to both researchers and developers.
• User-Friendly Interface: Designed for ease of use, allowing users to record and transcribe audio seamlessly.
• High-Quality Audio Recording: Captures clear voice recordings, which improves the quality of the resulting dataset.
• Automatic Transcription: Converts recorded audio into text, saving time and effort.
• Data Organization: Manages recorded audio and transcriptions in an organized structure for easy access.
• Export Capabilities: Allows users to export datasets in various formats for further analysis or model training.
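To make the data organization and export features more concrete, here is a minimal sketch of how recordings and transcriptions could be paired in a single manifest file. It is an illustration under stated assumptions, not Dhravani's actual storage or export format: the `Utterance` fields, the JSON Lines layout, and the `write_manifest` and `read_manifest` helpers are all hypothetical names chosen for this example.

```python
import json
from dataclasses import dataclass, asdict
from pathlib import Path


@dataclass
class Utterance:
    # Hypothetical record layout; Dhravani's real schema may differ.
    audio_path: str   # path to the recorded audio file
    transcript: str   # text spoken in the recording
    language: str     # language tag, e.g. "en" or "hi"
    speaker_id: str   # anonymous identifier for the speaker


def write_manifest(utterances, manifest_path):
    """Write one JSON object per line (JSON Lines), one per utterance."""
    path = Path(manifest_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as f:
        for utt in utterances:
            # ensure_ascii=False keeps non-Latin transcripts readable.
            f.write(json.dumps(asdict(utt), ensure_ascii=False) + "\n")
    return path


def read_manifest(manifest_path):
    """Load the manifest back into a list of Utterance records."""
    with Path(manifest_path).open(encoding="utf-8") as f:
        return [Utterance(**json.loads(line)) for line in f if line.strip()]
```

A one-record-per-line manifest like this is a common convention for speech datasets because it can be appended to during recording sessions and streamed during training without loading the whole file.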
What is a speech corpus?
A speech corpus is a collection of speech data, typically audio recordings paired with their transcriptions, used to train and test speech recognition systems or other AI models.
Can I use Dhravani for multiple languages?
Yes, Dhravani supports multiple languages, allowing users to create diverse speech datasets.
Is my recorded data secure?
Dhravani ensures that all recordings and transcriptions are stored securely, with options for encryption and privacy protection.