Evaluate RAG systems with visual analytics
InspectorRAGet is a tool designed to evaluate and benchmark RAG (Retrieval-Augmented Generation) systems. It provides visual analytics and insights to help users understand the performance and behavior of their RAG models, enabling data-driven optimizations and improvements.
• Visual Analytics: Gain insights into RAG system performance through interactive visualizations.
• Benchmarking Capabilities: Compare multiple RAG models side-by-side to identify strengths and weaknesses.
• Efficient Evaluation: Streamline the evaluation process with automated workflows and reporting.
• Customizable Metrics: Define and track key performance indicators tailored to your needs.
• Integration Support: Easily integrate with popular RAG frameworks and tools.
pip install inspectrraget
from inspectrraget import InspectorRAGet
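A minimal usage sketch follows. The constructor and the evaluate() call are illustrative assumptions, not a confirmed API; check the project documentation for the exact method names and input format.

from inspectrraget import InspectorRAGet  # package name as given in the install command above

# Hypothetical usage: create an evaluator and score a small evaluation set.
evaluator = InspectorRAGet()

# One evaluation example: the question, the passages the retriever returned,
# the generator's answer, and a reference answer to score against.
examples = [
    {
        "question": "What is retrieval-augmented generation?",
        "retrieved_passages": ["RAG pairs a retriever with a generative model."],
        "generated_answer": "RAG grounds LLM answers in retrieved passages.",
        "reference_answer": "RAG augments an LLM with retrieved context.",
    },
]

results = evaluator.evaluate(examples)  # score each example (illustrative call)
print(results)                          # inspect the aggregate metrics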
What is a RAG system?
A Retrieval-Augmented Generation (RAG) system combines retrieval mechanisms (e.g., databases or search engines) with generative models (e.g., large language models) to produce more accurate and contextually relevant responses.
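As a concrete illustration, the sketch below wires a toy keyword-match retriever to a stub generator. Both components are stand-ins: in practice the retriever would be a vector store or search engine and the generator would be an LLM call.

# Toy RAG pipeline: retrieve relevant text, then condition the generator on it.
documents = [
    "InspectorRAGet evaluates retrieval-augmented generation systems.",
    "A RAG system retrieves documents and feeds them to a generative model.",
]

def retrieve(query, docs, k=1):
    # Rank documents by simple word overlap with the query.
    query_words = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: len(query_words & set(d.lower().split())), reverse=True)
    return ranked[:k]

def generate(query, context):
    # Placeholder for an LLM call that conditions on the retrieved context.
    return f"Based on: {context[0]}\nAnswer to '{query}' goes here."

context = retrieve("What does a RAG system do?", documents)
print(generate("What does a RAG system do?", context))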
Can I customize the evaluation metrics?
Yes, InspectorRAGet allows you to define and use custom metrics to align with your specific evaluation goals.
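For example, a custom metric can be expressed as a plain scoring function over one evaluation example. The add_metric() registration call below is hypothetical and only illustrates the idea of user-defined metrics; the real registration mechanism may differ.

from inspectrraget import InspectorRAGet  # as installed above

def answer_length_ratio(example):
    # Toy metric: length of the generated answer relative to the reference answer.
    generated = example["generated_answer"].split()
    reference = example["reference_answer"].split()
    return len(generated) / max(len(reference), 1)

evaluator = InspectorRAGet()
# Hypothetical registration call for a user-defined metric.
evaluator.add_metric("answer_length_ratio", answer_length_ratio)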
How do I visualize the results?
InspectorRAGet provides built-in visualization tools that generate interactive charts and graphs. You can access these by calling the visualize() method after running your queries.
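Putting the pieces together, a short sketch of that flow is shown below; the evaluate() and visualize() calls are illustrative assumptions about the API rather than confirmed signatures.

from inspectrraget import InspectorRAGet  # as installed above

evaluator = InspectorRAGet()
evaluator.evaluate([
    {
        "question": "What is RAG?",
        "retrieved_passages": ["RAG pairs a retriever with a generator."],
        "generated_answer": "RAG grounds generation in retrieved text.",
    },
])
evaluator.visualize()  # renders interactive charts for the evaluation run above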