Generate and view leaderboard for LLM evaluations
The Arabic MMMLU Leaderboard is a platform designed to evaluate and compare the performance of large language models (LLMs) on the Arabic language. It provides a comprehensive leaderboard that ranks models based on their performance across various tasks and metrics, offering insight into their capabilities and limitations.
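For programmatic access, a minimal sketch of browsing leaderboard-style results follows. It assumes the scores are published as a Hugging Face dataset; the repo id and column names here are hypothetical, not the platform's actual ones.

```python
# Minimal sketch: browse leaderboard results programmatically.
# Assumes results are published as a Hugging Face dataset; the repo id
# "arabic-mmmlu/leaderboard-results" and the column names are hypothetical.
from datasets import load_dataset

results = load_dataset("arabic-mmmlu/leaderboard-results", split="train")

# Print each model with its average score, highest first.
for row in sorted(results, key=lambda r: r["average_score"], reverse=True):
    print(f'{row["model_name"]}: {row["average_score"]:.2f}')
```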
What is the purpose of the Arabic MMMLU Leaderboard?
The platform aims to provide a standardized way to evaluate and compare Arabic language models, helping researchers and developers identify top-performing models for specific tasks.
How are models ranked on the leaderboard?
Models are ranked based on their performance across a variety of tasks and datasets. Rankings are updated regularly as new evaluations are conducted.
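As a rough illustration of how such a ranking can be computed, the sketch below averages per-task scores into a single leaderboard score. The task names and numbers are made up, and the actual leaderboard may weight or aggregate tasks differently.

```python
import pandas as pd

# Hypothetical per-task accuracy scores (%); real tasks and weights may differ.
scores = pd.DataFrame(
    {
        "model": ["model-a", "model-b", "model-c"],
        "mmlu_ar": [61.2, 58.9, 64.0],
        "reasoning": [55.4, 60.1, 57.3],
        "reading": [70.8, 66.2, 69.5],
    }
).set_index("model")

# One common choice: rank by the unweighted mean across tasks.
scores["average"] = scores.mean(axis=1)
print(scores.sort_values("average", ascending=False))
```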
Can I submit my own model for evaluation?
Yes, the platform allows submissions from researchers and developers. Check the submission guidelines for requirements and instructions.
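The exact submission flow is defined by the platform's guidelines, but leaderboards of this kind typically require the model to be publicly available on the Hugging Face Hub first. A minimal sketch of that prerequisite step, using the huggingface_hub API (the repo id and local path are placeholders):

```python
from huggingface_hub import HfApi

api = HfApi()  # assumes you are logged in, e.g. via `huggingface-cli login`

# Create a public model repo and upload local weights.
# The repo id and folder path below are placeholders.
repo_id = "your-username/your-arabic-llm"
api.create_repo(repo_id=repo_id, repo_type="model", exist_ok=True)
api.upload_folder(repo_id=repo_id, folder_path="./my_model", repo_type="model")
```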