Browse and submit LLM evaluations
Evaluate AI-generated results for accuracy
Load AI models and prepare your space
Evaluate model predictions with TruLens
View and submit LLM evaluations
Compare LLM performance across benchmarks
Persian Text Embedding Benchmark
Rank machines based on LLaMA 7B v2 benchmark results
View RL Benchmark Reports
Display genomic embedding leaderboard
Upload ML model to Hugging Face Hub
Compare code model performance on benchmarks
Request model evaluation on COCO val 2017 dataset
The Open Medical-LLM Leaderboard is a platform for benchmarking and comparing large language models (LLMs) in the medical domain. It provides a centralized space to evaluate and track the performance of medical LLMs, helping researchers and practitioners identify the most suitable models for their use cases. The leaderboard is open and accessible: users can browse existing evaluations and submit their own LLM assessments.
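Beyond browsing in the web interface, the evaluation results behind leaderboards like this are typically stored as files in a Hub dataset repository. The sketch below shows one way such results could be pulled and inspected with the huggingface_hub client; the repository id and JSON field names here are placeholders, not the leaderboard's actual storage layout, so check the Space itself for the real repository.

import json
from pathlib import Path

from huggingface_hub import snapshot_download  # pip install huggingface_hub

# Hypothetical results repo; check the leaderboard Space for the real one.
RESULTS_REPO = "example-org/medical-llm-leaderboard-results"

# Download the per-model result files (commonly one JSON file per evaluated model).
local_dir = snapshot_download(repo_id=RESULTS_REPO, repo_type="dataset")

# Print whatever task scores each file reports; the field names are assumptions.
for path in sorted(Path(local_dir).rglob("*.json")):
    record = json.loads(path.read_text())
    model_name = record.get("model_name", path.stem)
    print(model_name, record.get("results", {}))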
What is the purpose of the Open Medical-LLM Leaderboard?
The purpose is to provide a transparent and accessible platform for benchmarking and comparing medical LLMs, helping users identify the best models for their specific applications.
How do I submit an evaluation for a new LLM?
Use the submission interface on the platform to submit your model for evaluation, and make sure the submission follows the platform's guidelines and data requirements. A pre-submission check is sketched below.
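Before submitting through the Space's form, it can help to confirm that the model repository meets the prerequisites common to Hugging Face leaderboards (a public repo on the Hub, a config that loads with transformers, and safetensors weights). The snippet below is a minimal check under those assumed requirements; the model id is hypothetical, and the definitive rules are the ones listed on the leaderboard's submission page.

from huggingface_hub import HfApi
from transformers import AutoConfig

MODEL_ID = "your-org/your-medical-llm"  # hypothetical model id

api = HfApi()
info = api.model_info(MODEL_ID)

# Weights in safetensors format are a common leaderboard requirement.
files = [s.rfilename for s in info.siblings]
has_safetensors = any(f.endswith(".safetensors") for f in files)
print(f"private repo: {info.private}, safetensors weights: {has_safetensors}")

# The config should load with transformers' auto classes so the evaluation
# harness can instantiate the model.
config = AutoConfig.from_pretrained(MODEL_ID)
print(f"architectures: {config.architectures}")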
How often is the leaderboard updated?
The leaderboard is updated regularly as new models and evaluations are submitted. Follow the platform’s updates or notifications to stay informed about the latest additions.