Evaluate RAG systems with visual analytics
InspectorRAGet is a tool designed to evaluate and benchmark RAG (Retrieval-Augmented Generation) systems. It provides visual analytics and insights to help users understand the performance and behavior of their RAG models, enabling data-driven optimizations and improvements.
• Visual Analytics: Gain insights into RAG system performance through interactive visualizations.
• Benchmarking Capabilities: Compare multiple RAG models side-by-side to identify strengths and weaknesses.
• Efficient Evaluation: Streamline the evaluation process with automated workflows and reporting.
• Customizable Metrics: Define and track key performance indicators tailored to your needs.
• Integration Support: Easily integrate with popular RAG frameworks and tools.
pip install inspectorraget
from inspectorraget import InspectorRAGet
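Below is a minimal quick-start sketch, assuming the package exposes an evaluator class matching the import above. The evaluate() call, its arguments, and the file names are illustrative placeholders, not a confirmed interface.

# Hypothetical quick-start; constructor arguments and method names below
# are illustrative assumptions, not documented InspectorRAGet API.
from inspectorraget import InspectorRAGet

evaluator = InspectorRAGet()              # assumed default constructor
results = evaluator.evaluate(             # assumed evaluation entry point
    predictions="predictions.json",       # model outputs to score (placeholder path)
    references="references.json",         # gold answers / contexts (placeholder path)
)
print(results)                            # summary of metric scores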
What is a RAG system?
A Retrieval-Augmented Generation (RAG) system combines retrieval mechanisms (e.g., databases or search engines) with generative models (e.g., large language models) to produce more accurate and contextually relevant responses.
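To make the retrieval-plus-generation flow concrete, here is a toy, framework-agnostic sketch. The keyword-overlap retriever and the generate() placeholder stand in for a real vector store and a real LLM; they are illustrative only.

# Toy RAG pipeline: retrieve the most relevant document, then condition
# the generator on it. Both components are deliberately naive placeholders.
DOCUMENTS = [
    "InspectorRAGet benchmarks RAG systems with visual analytics.",
    "RAG combines a retriever with a generative language model.",
]

def retrieve(query, docs):
    # Rank documents by simple word overlap with the query.
    q = set(query.lower().split())
    return max(docs, key=lambda d: len(q & set(d.lower().split())))

def generate(query, context):
    # Stand-in for an LLM call: answer by echoing the retrieved context.
    return f"Q: {query} | Answer grounded in: '{context}'"

query = "What does RAG combine?"
context = retrieve(query, DOCUMENTS)
print(generate(query, context))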
Can I customize the evaluation metrics?
Yes, InspectorRAGet allows you to define and use custom metrics to align with your specific evaluation goals.
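As an illustration, a custom metric can be written as a plain scoring function; how it is registered with InspectorRAGet (the metrics argument in the commented call) is a hypothetical hook, not a documented API.

# Hypothetical custom metric: fraction of reference keywords that appear
# in the generated answer.
def keyword_recall(prediction: str, reference: str) -> float:
    ref_terms = set(reference.lower().split())
    pred_terms = set(prediction.lower().split())
    return len(ref_terms & pred_terms) / max(len(ref_terms), 1)

# How a custom metric is plugged in is an assumption:
# results = evaluator.evaluate(
#     predictions="predictions.json",
#     references="references.json",
#     metrics={"keyword_recall": keyword_recall},   # assumed registration hook
# )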
How do I visualize the results?
InspectorRAGet provides built-in visualization tools that generate interactive charts and graphs. You can access these by calling the visualize() method after running your queries.
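Continuing from the quick-start sketch above (same assumed evaluator object): visualize() is the method named in this FAQ answer, while the surrounding setup remains an illustrative assumption.

# visualize() comes from the FAQ answer above; the evaluator object and the
# preceding evaluate() call are assumed, not documented API.
evaluator.visualize()   # opens interactive charts for the latest evaluation run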