List of French datasets not referenced on the Hub
Help make the Carnaval de Cádiz more accessible
Upload files to a Hugging Face repository
Access NLPre-PL dataset and pre-trained models
Save user inputs to datasets on Hugging Face
Clean and process datasets
Manage and label your datasets
Create a large, deduplicated dataset for LLM pre-training
A collection of parsers for LLM benchmark datasets
Display HTML
Launch and explore labeled datasets
Browse and extract data from Hugging Face datasets
Jeux de données en français mal référencés sur le Hub ("French datasets poorly referenced on the Hub") is a curated list of French-language datasets that are poorly referenced or hard to find on the Hugging Face Hub. It gives researchers, data scientists, and enthusiasts a single place to discover these datasets.
• Comprehensive Collection: A wide range of French datasets across various domains, including but not limited to natural language processing, social sciences, and more.
• Categorized Datasets: Datasets are organized by category for easier navigation and discovery.
• Search Functionality: Users can search for specific datasets using keywords or filters.
• Regular Updates: The list is frequently updated to include new or newly discovered datasets.
• Community-Driven: Contributions and suggestions from the community are welcome to expand the collection.
• Downloadable Formats: Datasets are available in multiple formats, ensuring compatibility with various tools and workflows.
• Version Tracking: Each dataset is tracked for updates, ensuring users always have access to the latest version.
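As a rough illustration of the keyword and category search described in the list above, here is a minimal Python sketch. It does not reflect how the tool is actually implemented: the `DatasetEntry` fields, the `search` function, and the catalog entries are all hypothetical placeholders invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    """One row of a hypothetical catalog; the field names are assumptions."""
    name: str
    category: str
    description: str


def search(entries, keyword="", category=None):
    """Return entries whose name or description contains `keyword`
    (case-insensitive), optionally restricted to a single category."""
    needle = keyword.casefold()
    results = []
    for entry in entries:
        # Skip entries outside the requested category, if one was given.
        if category is not None and entry.category != category:
            continue
        haystack = f"{entry.name} {entry.description}".casefold()
        # An empty keyword matches every entry.
        if needle in haystack:
            results.append(entry)
    return results


# Placeholder entries for illustration only; these are not real datasets.
CATALOG = [
    DatasetEntry("example-fr-news", "nlp", "Articles de presse en français"),
    DatasetEntry("example-fr-survey", "social-sciences", "Réponses à une enquête"),
    DatasetEntry("example-fr-qa", "nlp", "Questions et réponses en français"),
]
```

Calling `search(CATALOG, "français")` would return the two entries whose descriptions mention the keyword, and `search(CATALOG, category="social-sciences")` would return only the survey entry. A real catalog would likely need accent-insensitive matching as well, which this sketch leaves out.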
1. What is the purpose of Jeux de données en français mal référencés sur le Hub?
The tool aims to centralize and make accessible French datasets that are hard to find or underrepresented on the Hub, providing a valuable resource for French-language data projects.
2. How often are new datasets added to the collection?
New datasets are added regularly, with updates typically occurring weekly or bi-weekly, depending on community contributions and discoveries.
3. Can I contribute a dataset to the collection?
Yes, contributions are welcome. If you know of a French dataset that is not well-referenced, you can submit it through the provided submission process on the platform.