Evaluate Persian LLMs on various tasks
The 🤗 Persian LLM Leaderboard is a comprehensive resource for evaluating and comparing Persian large language models (LLMs) across various tasks and metrics. It provides a centralized platform for researchers, developers, and users to assess the performance of different models and make informed decisions based on their needs. The leaderboard is designed to promote transparency and innovation in Persian natural language processing.
• Model Performance Tracking: Detailed performance metrics for various Persian LLMs on tasks like text classification, summarization, and question answering.
• Task-Specific Benchmarking: Evaluation across a wide range of NLP tasks tailored to the Persian language.
• Comparative Analysis: Side-by-side comparison of models to identify strengths and weaknesses.
• Regular Updates: Continuous updates with new models, tasks, and metrics.
• Open Accessibility: Available to everyone, including researchers, developers, and enthusiasts.
• Documentation and Resources: Access to datasets, evaluation scripts, and best practices for benchmarking; a minimal evaluation sketch follows this list.
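
As a rough illustration of what task-specific benchmarking can look like in practice, here is a minimal sketch that scores a Persian text classifier on a tiny hand-labeled sample. The model ID, the example sentences, and the gold labels are placeholders chosen for illustration, not part of the leaderboard's actual evaluation scripts or datasets.

```python
# Minimal sketch: accuracy of a Persian text classifier on a tiny
# hand-labeled sample. The model ID and label names are placeholders;
# the leaderboard's real evaluation scripts and datasets differ.
from transformers import pipeline

MODEL_ID = "HooshvareLab/bert-fa-base-uncased-sentiment-digikala"  # placeholder ID

classifier = pipeline("text-classification", model=MODEL_ID)

# Illustrative Persian sentiment examples with assumed gold labels.
samples = [
    ("این محصول عالی بود و کاملاً راضی هستم", "recommended"),
    ("کیفیت بسیار بد بود و پیشنهاد نمی‌کنم", "not_recommended"),
]

correct = sum(
    classifier(text)[0]["label"] == gold  # label names depend on the model config
    for text, gold in samples
)
print(f"accuracy = {correct / len(samples):.2f}")
```

A real benchmark run would replace the two hand-written examples with a standardized test set and report the same metric for every model, which is what makes the per-task numbers on the leaderboard comparable.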
What models are included in the leaderboard?
The leaderboard includes a variety of Persian language models, ranging from smaller, efficient models to larger, state-of-the-art architectures. Models are added continuously as they are developed and benchmarked.
How are models rated or ranked?
Models are ranked by their evaluation results on standardized datasets for specific tasks and metrics, so a model's position may differ depending on which task or metric is being considered.
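
For illustration only, a cross-task average is one simple way such a ranking could be computed. The model names and scores below are invented, and the leaderboard's actual aggregation may weight tasks or metrics differently.

```python
# Sketch: rank models by their mean score across tasks.
# Model names and per-task scores are invented for illustration.
scores = {
    "persian-model-a": {"classification": 0.81, "summarization": 0.64, "qa": 0.72},
    "persian-model-b": {"classification": 0.77, "summarization": 0.70, "qa": 0.75},
    "persian-model-c": {"classification": 0.69, "summarization": 0.73, "qa": 0.68},
}

def mean_score(task_scores):
    """Unweighted average over tasks; a real leaderboard may weight tasks."""
    return sum(task_scores.values()) / len(task_scores)

for rank, model in enumerate(
    sorted(scores, key=lambda m: mean_score(scores[m]), reverse=True), start=1
):
    print(f"{rank}. {model}  avg={mean_score(scores[model]):.3f}")
```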
How often is the leaderboard updated?
The leaderboard is updated regularly to include new models, tasks, and metrics. Updates are typically announced on the official platform or through associated communication channels.