View and submit LLM benchmark evaluations
The OpenLLM Turkish leaderboard v0.2 evaluates and benchmarks large language models (LLMs) for Turkish. It gives developers and researchers a platform to submit models and compare evaluation results across Turkish-specific tasks and metrics. The leaderboard aims to promote transparency and progress in Turkish NLP by enabling fair comparisons of model performance.
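Submission details are enforced by the Space UI itself, but as a minimal pre-submission sketch, assuming entries must be public model repositories on the Hugging Face Hub (the repo id below is hypothetical), a candidate can be sanity-checked with huggingface_hub:

```python
# Pre-submission sanity check: assumes the leaderboard expects a public
# model repo on the Hugging Face Hub. The repo id below is hypothetical.
from huggingface_hub import HfApi

def check_candidate(repo_id: str) -> None:
    api = HfApi()
    info = api.model_info(repo_id)  # raises if the repo is missing or private
    files = [s.rfilename for s in info.siblings]
    has_weights = any(f.endswith((".safetensors", ".bin")) for f in files)
    print(f"{repo_id}: pipeline_tag={info.pipeline_tag}, weights_found={has_weights}")

check_candidate("some-org/turkish-llm-7b")  # hypothetical repo id
```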
What models are supported on the leaderboard?
The leaderboard supports a variety of models, including popular architectures such as T5 and BERT as well as specialized Turkish models. A quick local smoke test is sketched below.
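To try a supported model locally before comparing it on the leaderboard, here is a short sketch using transformers; the checkpoint is the public BERTurk model, and you can substitute any Turkish model from the Hub:

```python
# Quick local smoke test of a Turkish model with transformers.
# dbmdz/bert-base-turkish-cased (BERTurk) is a public Hub checkpoint;
# swap in whichever model you plan to submit.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="dbmdz/bert-base-turkish-cased")
for candidate in fill_mask("Türkiye'nin başkenti [MASK].")[:3]:
    print(candidate["token_str"], round(candidate["score"], 3))
```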
How are models evaluated?
Models are evaluated on standard NLP tasks such as text classification, question answering, and machine translation, using metrics such as precision, recall, and BLEU.
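As an illustration of these metric types (not the leaderboard's exact evaluation harness), here is a sketch computing BLEU with sacrebleu and precision/recall with scikit-learn on toy data:

```python
# Toy illustration of the metrics mentioned above; the strings and
# labels are made-up examples, not leaderboard data.
from sacrebleu.metrics import BLEU
from sklearn.metrics import precision_score, recall_score

# BLEU for a translation-style task (one hypothesis, one reference stream).
bleu = BLEU()
result = bleu.corpus_score(
    ["kedi paspasın üstünde oturuyor"],      # system outputs
    [["kedi paspasın üzerinde oturuyor"]],   # references, one stream
)
print(f"BLEU: {result.score:.2f}")

# Precision/recall for a classification-style task.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]
print("precision:", precision_score(y_true, y_pred))
print("recall:", recall_score(y_true, y_pred))
```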
How often is the leaderboard updated?
The leaderboard is updated regularly with new models, datasets, and features to reflect the latest advancements in Turkish NLP.