Evaluate open LLMs in the languages of LATAM and Spain.
La Leaderboard is a model benchmarking tool for evaluating and comparing open large language models (LLMs) in the languages of Latin America (LATAM) and Spain. It gives researchers and developers a single place to assess model performance across tasks and languages, with evaluations tailored to these regions.
• Multilingual Support: Evaluate models in multiple languages across LATAM and Spain.
• Customizable Benchmarks: Define specific tasks and metrics to suit your evaluation needs.
• Interactive Dashboards: Visualize model performance through intuitive and detailed graphs.
• Real-Time Tracking: Monitor model updates and compare their performance over time.
• Comprehensive Reporting: Access detailed analysis and insights for each evaluated model.
• Model Comparisons: Directly compare multiple models side-by-side (see the sketch after this list).
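The side-by-side comparison can also be reproduced offline over an exported score table. The sketch below assumes a generic pandas DataFrame; the model names, benchmark columns, and score values are hypothetical placeholders and are not taken from La Leaderboard.

```python
# Minimal sketch of a side-by-side model comparison over exported scores.
# All model names, columns, and values are hypothetical placeholders,
# not actual La Leaderboard results.
import pandas as pd

scores = pd.DataFrame(
    {
        "model": ["model-a-7b", "model-b-7b", "model-c-13b"],
        "es_accuracy": [0.71, 0.68, 0.75],  # Spanish tasks (hypothetical)
        "pt_accuracy": [0.63, 0.66, 0.70],  # Portuguese tasks (hypothetical)
    }
)

# Average across languages and rank models, best first.
scores["avg"] = scores[["es_accuracy", "pt_accuracy"]].mean(axis=1)
comparison = scores.sort_values("avg", ascending=False).set_index("model")
print(comparison)
```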
What languages does La Leaderboard support?
La Leaderboard supports Spanish, Portuguese, and other languages widely spoken across Latin America and Spain.
How often are new models added to La Leaderboard?
New models are added regularly as they become available in the open LLM ecosystem.
Can I customize the benchmarks for specific tasks?
Yes, La Leaderboard allows users to define custom benchmarks tailored to their specific requirements.
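To illustrate what defining a custom benchmark amounts to, the sketch below treats a benchmark as a list of prompt/reference pairs plus a scoring rule. This is a generic illustration under stated assumptions, not La Leaderboard's internal task format; the example items, the exact-match metric, and the stand-in generate function are all hypothetical.

```python
# Minimal sketch of a custom benchmark: prompt/reference pairs plus a scoring rule.
# The task items and the stand-in generate() below are hypothetical placeholders.
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class CustomBenchmark:
    name: str
    items: List[Tuple[str, str]]          # (prompt, reference answer)
    score: Callable[[str, str], float]    # (prediction, reference) -> 0..1


def exact_match(prediction: str, reference: str) -> float:
    """Score 1.0 when the prediction matches the reference, ignoring case/whitespace."""
    return float(prediction.strip().lower() == reference.strip().lower())


def run_benchmark(benchmark: CustomBenchmark, generate: Callable[[str], str]) -> float:
    """Average the per-item scores of a model's generate() over the benchmark."""
    results = [benchmark.score(generate(prompt), ref) for prompt, ref in benchmark.items]
    return sum(results) / len(results)


# Hypothetical usage with a trivial stand-in for a real model:
capitales_es = CustomBenchmark(
    name="capitales-es",
    items=[
        ("¿Cuál es la capital de Perú?", "Lima"),
        ("¿Cuál es la capital de España?", "Madrid"),
    ],
    score=exact_match,
)
print(run_benchmark(capitales_es, generate=lambda prompt: "Lima"))  # 0.5
```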