Synthetic Data Generator
Build datasets using natural language
Convert to Safetensors
Convert and PR models to Safetensors
TxT360: Trillion Extracted Text
Create a large, deduplicated dataset for LLM pre-training
PDF to Dataset
Convert PDFs to a dataset and upload to Hugging Face
Datasets Tagging
Create and validate structured metadata for datasets
Research Tracker
Semantic Hugging Face Hub Search
Search and find similar datasets
Datasets Convertor
Supports Parquet, CSV, JSONL, XLS
Space to Dataset Saver
Save user inputs to datasets on Hugging Face
Domain Specific Seed
Create a domain-specific dataset project
Reddit Dataset Creator
Create a Reddit dataset
SmolVLM2 IPhone Waitlist
Sign in to receive news on the iPhone app
gradio_huggingfacehub_search V0.0.7
Search for Hugging Face Hub models
Distilabel Synthetic Data Pipeline Finder
Find and view synthetic data pipelines on Hugging Face
Open LLM Leaderboard Renamer
Rename models in the dataset leaderboard
Dataset ReWriter
Rewrite datasets with a text instruction
Recent Hugging Face Datasets
Explore recent datasets from Hugging Face Hub
Collection Dataset Explorer
Browse and view Hugging Face datasets
Distilabel Dataset Generator
Create datasets with FAQs and SFT prompts
Trending Repos
Display trending datasets from Hugging Face