Generate and view leaderboard for LLM evaluations
Browse and filter machine learning models by category and modality
Calculate survival probability based on passenger details
View and submit LLM evaluations
Track, rank and evaluate open LLMs and chatbots
Measure BERT model performance using WASM and WebGPU
Optimize and train foundation models using IBM's FMS
Browse and evaluate ML tasks in MLIP Arena
Convert Hugging Face models to OpenVINO format
Visualize model performance on function calling tasks
Evaluate open LLMs in the languages of LATAM and Spain
Display genomic embedding leaderboard
View and submit LLM benchmark evaluations
The Arabic MMMLU Leaderboard is a platform designed to evaluate and compare the performance of large language models (LLMs) specifically for the Arabic language. It provides a comprehensive leaderboard that ranks models by their performance across various tasks and metrics, offering insight into their capabilities and limitations.
What is the purpose of the Arabic MMMLU Leaderboard?
The platform aims to provide a standardized way to evaluate and compare Arabic language models, helping researchers and developers identify top-performing models for specific tasks.
How are models ranked on the leaderboard?
Models are ranked based on their performance across a variety of tasks and datasets. Rankings are updated regularly as new evaluations are conducted.
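The exact aggregation method is not specified here; a minimal sketch in Python of one common approach, ranking models by the mean of their per-task accuracies. All model names, task names, and scores below are hypothetical and for illustration only:

```python
# Illustrative ranking sketch, assuming rankings are derived from a simple
# mean of per-task accuracies. Names and scores are hypothetical.
from statistics import mean

# Hypothetical per-task accuracy scores (0-1 scale) for each model.
results = {
    "model-a": {"reading_comprehension": 0.71, "reasoning": 0.64, "knowledge": 0.69},
    "model-b": {"reading_comprehension": 0.66, "reasoning": 0.70, "knowledge": 0.62},
}

# Rank models by average accuracy across all tasks, highest first.
ranking = sorted(results.items(), key=lambda kv: mean(kv[1].values()), reverse=True)

for rank, (model, scores) in enumerate(ranking, start=1):
    print(f"{rank}. {model}: {mean(scores.values()):.3f}")
```

Real leaderboards may weight tasks differently or normalize scores; this sketch only shows the general shape of the aggregation.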
Can I submit my own model for evaluation?
Yes, the platform allows submissions from researchers and developers. Check the submission guidelines for requirements and instructions.
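The submission process itself is not documented here, but leaderboards of this kind commonly require the model to be hosted on the Hugging Face Hub. A minimal sketch, under that assumption, of checking that a repository resolves before submitting; the repository id is hypothetical:

```python
# Hypothetical pre-submission check, assuming the leaderboard requires
# models hosted on the Hugging Face Hub. The repo id is a placeholder.
from huggingface_hub import model_info

repo_id = "my-org/my-arabic-llm"  # hypothetical repository id

try:
    info = model_info(repo_id)
    print(f"Found {info.id} (last modified {info.last_modified}); ready to submit.")
except Exception as err:
    print(f"Could not resolve {repo_id}: {err}")
```

Consult the platform's submission guidelines for the authoritative requirements.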