View NSQL Scores for Models
DuckDB NSQL Leaderboard is a model benchmarking tool that lets users view and compare NSQL scores across different models. It provides a centralized platform for evaluating and visualizing performance metrics, making it easier to analyze and improve model effectiveness.
• Real-time Tracking: Monitor NSQL scores as they are updated in real time.
• Interactive Visualization: Explore performance metrics through graphs and charts.
• Filter and Sort: Easily filter and sort models by criteria such as score, date, or model type.
• Side-by-Side Comparison: Compare multiple models directly to identify strengths and weaknesses.
• Data Export: Export leaderboard data for further analysis or reporting (see the sketch after this list).
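As a concrete illustration of the filter, sort, and export features, here is a minimal Python sketch using the duckdb client. It assumes the leaderboard has been exported to a local file named leaderboard.csv with hypothetical columns model, model_type, score, and date; the leaderboard's actual export schema may differ.

```python
import duckdb

con = duckdb.connect()

# "Filter and Sort": keep one model type and rank by score.
# Column names here are illustrative, not the real export format.
top_models = con.sql("""
    SELECT model, score, date
    FROM read_csv_auto('leaderboard.csv')
    WHERE model_type = 'text2sql'
    ORDER BY score DESC
    LIMIT 10
""")
print(top_models)

# "Data Export": write the filtered view back out for reporting.
con.sql("""
    COPY (
        SELECT * FROM read_csv_auto('leaderboard.csv')
        WHERE model_type = 'text2sql'
    ) TO 'filtered_leaderboard.csv' (HEADER, DELIMITER ',')
""")
```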
What does NSQL stand for?
NSQL refers to natural language to SQL. An NSQL score measures how well a model translates natural-language questions into SQL queries.
How are NSQL scores calculated?
NSQL scores are calculated by evaluating a model's ability to turn natural-language questions into correct SQL, typically against a standardized benchmark suite of question–query pairs.
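One common way such a score is computed is execution accuracy: run the model's predicted SQL and the gold SQL against the same database and check that the results match. The sketch below illustrates that approach under stated assumptions; it is not necessarily the leaderboard's exact metric, and execution_match, nsql_score, and the (predicted, gold) pair list are hypothetical names.

```python
import duckdb

def execution_match(pred_sql: str, gold_sql: str, db_path: str) -> bool:
    """Return True if the predicted query produces the same rows as the
    gold query. Simplified: real harnesses also normalize column order,
    row-order semantics, and floating-point values."""
    con = duckdb.connect(db_path, read_only=True)
    try:
        pred_rows = con.sql(pred_sql).fetchall()
        gold_rows = con.sql(gold_sql).fetchall()
    except duckdb.Error:
        return False  # a query that fails to parse or execute scores zero
    finally:
        con.close()
    return sorted(map(repr, pred_rows)) == sorted(map(repr, gold_rows))

def nsql_score(pairs, db_path: str) -> float:
    """Fraction of (predicted_sql, gold_sql) pairs whose predicted
    query executes to the gold result."""
    hits = sum(execution_match(pred, gold, db_path) for pred, gold in pairs)
    return hits / len(pairs)
```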
Can I export the leaderboard data for my own analysis?
Yes, the DuckDB NSQL Leaderboard provides options to export data in formats like CSV or Excel for further analysis.
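For the Excel case specifically, one option (assuming the data has first been pulled into Python via the duckdb client) is to round-trip through a pandas DataFrame. The file and column names below are illustrative, and writing .xlsx requires the openpyxl package.

```python
import duckdb

# Load the exported leaderboard into a pandas DataFrame, then
# write it to an Excel workbook for further analysis.
df = duckdb.sql("SELECT * FROM read_csv_auto('leaderboard.csv')").df()
df.to_excel("leaderboard.xlsx", index=False)
```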