The Open Medical-LLM Leaderboard is a platform for benchmarking and comparing large language models (LLMs) in the medical domain. It provides a centralized space to evaluate and track the performance of medical LLMs, helping researchers and practitioners identify the most suitable models for their use cases. The leaderboard is open and accessible: users can browse existing evaluations and submit their own models for assessment.
What is the purpose of Open Medical-LLM Leaderboard?
The purpose is to provide a transparent and accessible platform for benchmarking and comparing medical LLMs, helping users identify the best models for their specific applications.
How do I submit an evaluation for a new LLM?
Use the platform's submission interface to submit your model and its evaluation results, making sure they comply with the platform's guidelines and data requirements.
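If the leaderboard runs as a Gradio Space, submissions can sometimes be scripted with gradio_client instead of the web form. The snippet below is only a hedged sketch: the Space id, endpoint name, and parameters are assumptions for illustration, not the platform's documented API; check the Space's "Use via API" page for the actual signature.

```python
# Hypothetical sketch of a programmatic submission to a Gradio-based leaderboard.
# The Space id, api_name, and parameter list are assumptions, not confirmed values.
from gradio_client import Client

# Assumed Space id -- replace with the leaderboard's actual Space.
client = Client("openlifescienceai/open_medical_llm_leaderboard")

result = client.predict(
    "my-org/my-medical-llm",  # Hub id of the model to evaluate (assumed parameter)
    "float16",                # evaluation precision (assumed parameter)
    api_name="/submit",       # assumed endpoint name
)
print(result)
```

In practice, most users will simply fill in the submission form on the Space itself; a script like this is mainly useful for submitting several models in a batch.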
How often is the leaderboard updated?
The leaderboard is updated regularly as new models and evaluations are submitted. Follow the platform’s updates or notifications to stay informed about the latest additions.