Evaluate RAG systems with visual analytics
InspectorRAGet is a tool designed to evaluate and benchmark RAG (Retrieval-Augmented Generation) systems. It provides visual analytics and insights to help users understand the performance and behavior of their RAG models, enabling data-driven optimizations and improvements.
• Visual Analytics: Gain insights into RAG system performance through interactive visualizations.
• Benchmarking Capabilities: Compare multiple RAG models side-by-side to identify strengths and weaknesses.
• Efficient Evaluation: Streamline the evaluation process with automated workflows and reporting.
• Customizable Metrics: Define and track key performance indicators tailored to your needs.
• Integration Support: Easily integrate with popular RAG frameworks and tools.
pip install inspectorraget
from inspectorraget import InspectorRAGet
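A quick-start might look like the following sketch. Only the InspectorRAGet class is named in the snippet above; the constructor call, load_results(), and evaluate() are illustrative assumptions rather than a documented API.

# Hypothetical quick-start: load RAG outputs and run the built-in evaluation.
# Only the InspectorRAGet class comes from the import above; load_results()
# and evaluate() are illustrative assumptions, not a documented API.
from inspectorraget import InspectorRAGet

inspector = InspectorRAGet()

# Assumed input: a JSON file of model responses paired with reference answers.
results = inspector.load_results("rag_outputs.json")

# Run the default metrics over the loaded results and print a summary.
report = inspector.evaluate(results)
print(report)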
What is a RAG system?
A Retrieval-Augmented Generation (RAG) system combines retrieval mechanisms (e.g., databases or search engines) with generative models (e.g., large language models) to produce more accurate and contextually relevant responses.
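To make the retrieval-plus-generation loop concrete, here is a toy, self-contained sketch that is independent of InspectorRAGet; the keyword-overlap retriever and the templated "generator" are deliberate stand-ins for a real vector store and a large language model.

# Toy RAG pipeline: retrieve the most relevant document, then condition the
# "generator" on it. Both components are simplistic stand-ins for real ones
# (e.g., a vector database and a large language model).
documents = [
    "The Eiffel Tower is located in Paris, France.",
    "The Great Wall of China is over 13,000 miles long.",
]

def retrieve(query: str) -> str:
    # Naive keyword-overlap retrieval; a real system would use dense embeddings.
    def overlap(doc: str) -> int:
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return max(documents, key=overlap)

def generate(query: str, context: str) -> str:
    # Placeholder for an LLM call: the retrieved context grounds the answer.
    return f"Based on the context '{context}', here is an answer to: {query}"

query = "Where is the Eiffel Tower?"
print(generate(query, retrieve(query)))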
Can I customize the evaluation metrics?
Yes, InspectorRAGet allows you to define and use custom metrics to align with your specific evaluation goals.
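As an illustration, a custom metric can be thought of as a plain scoring function over a model response and a reference answer; how such a function is actually registered with InspectorRAGet is not documented here, so the commented-out register_metric() call below is purely hypothetical.

# A custom metric as a plain scoring function: 1.0 if the reference answer
# appears verbatim in the model response, else 0.0.
def exact_containment(response: str, reference: str) -> float:
    return 1.0 if reference.lower() in response.lower() else 0.0

# Hypothetical registration hook -- the actual InspectorRAGet API may differ.
# inspector.register_metric("exact_containment", exact_containment)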
How do I visualize the results?
InspectorRAGet provides built-in visualization tools that generate interactive charts and graphs. You can access these by calling the visualize() method after running your queries.
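Under the same assumptions as the quick-start sketch above, that flow might end like this; only visualize() is named in the answer, and run_queries() is an illustrative placeholder.

# Hypothetical: run_queries() is a placeholder; only visualize() is
# mentioned in the FAQ answer above.
answers = inspector.run_queries(["Where is the Eiffel Tower?"])
inspector.visualize(answers)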