
Hallucinations Leaderboard

View and submit LLM evaluations

You May Also Like

  • 😻 2025 AI Timeline – Browse and filter machine learning models by category and modality
  • 🐶 Convert HF Diffusers repo to single safetensors file V2 (for SDXL / SD 1.5 / LoRA) – Convert Hugging Face model repo to Safetensors
  • ⚔ MTEB Arena – Teach, test, evaluate language models with MTEB Arena
  • 🔀 mergekit-gui – Merge machine learning models using a YAML configuration file
  • 👀 Model Drops Tracker – Find recent, highly liked Hugging Face models
  • 🐠 WebGPU Embedding Benchmark – Measure execution times of BERT models using WebGPU and WASM
  • 📏 Cetvel – Pergel: A Unified Benchmark for Evaluating Turkish LLMs
  • 🥇 DécouvrIR – Leaderboard of information retrieval models in French
  • 🎙 ConvCodeWorld – Evaluate code generation with diverse feedback types
  • 📈 GGUF Model VRAM Calculator – Calculate VRAM requirements for LLMs
  • 🏅 Open Persian LLM Leaderboard
  • 🌐 Multilingual MMLU Benchmark Leaderboard – Display and submit LLM benchmarks

What is Hallucinations Leaderboard?

Hallucinations Leaderboard is a tool designed for evaluating and benchmarking large language models (LLMs). It provides a platform to view and submit evaluations of model performance, with a focus on understanding and mitigating hallucinations—instances where models produce inaccurate or non-factual information.

Features

  • Leaderboard System: Compare the performance of different LLMs based on hallucination metrics.
  • Benchmarking Tools: Access standardized tests and evaluations for assessing model accuracy.
  • Customizable Metrics: Define and apply specific criteria for measuring hallucinations (a sketch follows this list).
  • Model Comparison: Directly compare multiple models side by side.
  • Submission Interface: Easily submit your own evaluations for inclusion in the leaderboard.
  • Filtering and Sorting: Narrow down results by model size, architecture, or performance thresholds.
  • Real-Time Updates: Stay current with the latest evaluations and benchmarks.
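
The leaderboard defines its own scoring; purely as an illustration of what a "customizable metric" can look like, here is a minimal Python sketch of a hallucination-rate measure. The function name and the naive substring check are hypothetical stand-ins, not the platform's actual code.

    # Minimal sketch of a custom hallucination metric. The naive substring
    # check is a hypothetical stand-in for real fact-checking logic.
    def hallucination_rate(answers: list[str], references: list[str]) -> float:
        """Fraction of answers that fail to mention the reference fact."""
        assert len(answers) == len(references), "need one reference per answer"
        misses = sum(ref.lower() not in ans.lower()
                     for ans, ref in zip(answers, references))
        return misses / len(answers)

    answers = ["The Eiffel Tower is in Berlin.",
               "Water boils at 100 C at sea level."]
    references = ["Paris", "100 C"]
    print(hallucination_rate(answers, references))  # 0.5

In practice, metrics of this kind are usually backed by NLI models or human judgments rather than simple string matching, but the shape of the computation is the same: score each answer against a reference, then aggregate.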

How to use Hallucinations Leaderboard?

  1. Visit the Platform: Go to the Hallucinations Leaderboard website.
  2. Explore Models: Browse the leaderboard to view evaluated models and their performance.
  3. Compare Models: Use the comparison feature to analyze multiple models simultaneously.
  4. Submit Evaluations: If you have conducted evaluations, use the submission interface to add your results.
  5. Analyze Metrics: Dive into detailed metrics to understand hallucination patterns and model accuracy.
  6. Filter Results: Apply filters to narrow down models based on specific criteria (see the sketch after this list).
  7. Stay Updated: Check back regularly for new evaluations and updated rankings.
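
If you keep a local copy of the results, steps 3 and 6 can also be scripted. The sketch below assumes a hypothetical CSV export with made-up column names (params_billion, hallucination_score); the platform's actual schema may differ.

    # Hypothetical example for step 6: filter and sort a local CSV export
    # of leaderboard results. Column names are assumptions, not the
    # platform's actual schema.
    import pandas as pd

    df = pd.read_csv("leaderboard_results.csv")  # assumed local export

    # Keep models up to 13B parameters; best (lowest) hallucination score first.
    small = df[df["params_billion"] <= 13].sort_values("hallucination_score")
    print(small[["model_name", "params_billion", "hallucination_score"]].head(10))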

Frequently Asked Questions

What is the purpose of Hallucinations Leaderboard?
The purpose is to provide a centralized platform for evaluating and comparing LLMs, with a focus on reducing hallucinations and improving model accuracy.

How do I submit my own evaluations?
To submit evaluations, use the submission interface on the platform. Ensure your results align with the defined metrics and criteria.
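
The platform defines the real submission format; as a loose illustration only, a prepared result might be serialized like this. Every field name below is a hypothetical placeholder, so check the submission interface for the actual schema.

    # Loose illustration only: every field name is a hypothetical
    # placeholder; the submission interface defines the real schema.
    import json

    submission = {
        "model_name": "example-org/example-7b",  # hypothetical model ID
        "benchmark": "hallucination-eval-v1",    # hypothetical benchmark name
        "metrics": {"hallucination_rate": 0.12, "accuracy": 0.81},
        "notes": "Greedy decoding, zero-shot.",
    }

    with open("submission.json", "w") as f:
        json.dump(submission, f, indent=2)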

Why is tracking hallucinations important?
Hallucinations can lead to misinformation. Tracking them helps improve model reliability and trustworthiness in real-world applications.

Recommended Categories

  • 📊 Convert CSV data into insights
  • 🖌️ Generate a custom logo
  • 📋 Text Summarization
  • 🗂️ Dataset Creation
  • 😊 Sentiment Analysis
  • 🕺 Pose Estimation
  • 🌈 Colorize black and white photos
  • 🎵 Music Generation
  • 🖼️ Image Captioning
  • ❓ Visual QA
  • ⬆️ Image Upscaling
  • 🧹 Remove objects from a photo
  • 💡 Change the lighting in a photo
  • 🗣️ Speech Synthesis
  • 🔤 OCR