Cetvel: A Unified Benchmark for Evaluating Turkish LLMs
Cetvel is a unified benchmark tool designed for evaluating Turkish Large Language Models (LLMs). It provides a comprehensive framework for analyzing and comparing model performance across a variety of tasks, making it a valuable tool for researchers and developers working on Turkish NLP.
• Comprehensive Task Coverage: Evaluate models on tasks such as translation, summarization, and question-answering specific to the Turkish language.
• Customizable Benchmarks: Create tailored benchmarking suites to focus on specific aspects of model performance.
• Cross-Model Comparisons: Compare multiple Turkish LLMs side-by-side to identify strengths and weaknesses (an illustrative sketch follows this list).
• Detailed Reporting: Generate in-depth reports highlighting model accuracy, efficiency, and robustness.
• Integration with Popular LLMs: Supports integration with widely-used Turkish and multilingual LLMs.
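To make the cross-model comparison workflow concrete, here is a minimal, self-contained Python sketch: score each model on a shared task and print the results side by side. The tiny question set, the `exact_match_accuracy` helper, and the stand-in model callables are illustrative assumptions only and are not part of Cetvel's actual API.

```python
from typing import Callable, Dict, List, Tuple

# Toy QA items: (prompt, reference answer). A real benchmark run would load
# full Turkish datasets; these two items only illustrate the workflow.
EVAL_ITEMS: List[Tuple[str, str]] = [
    ("Türkiye'nin başkenti neresidir?", "Ankara"),
    ("Bir yılda kaç ay vardır?", "12"),
]

def exact_match_accuracy(model: Callable[[str], str]) -> float:
    """Score a model callable by exact match against the reference answers."""
    hits = sum(model(prompt).strip() == ref for prompt, ref in EVAL_ITEMS)
    return hits / len(EVAL_ITEMS)

def compare_models(models: Dict[str, Callable[[str], str]]) -> None:
    """Print a simple side-by-side accuracy table for several models."""
    print(f"{'model':<20} {'exact_match':>12}")
    for name, model in sorted(models.items()):
        print(f"{name:<20} {exact_match_accuracy(model):>12.2f}")

if __name__ == "__main__":
    # Stand-in "models": in practice these would wrap real Turkish LLMs.
    compare_models({
        "always-ankara": lambda prompt: "Ankara",
        "echo-prompt": lambda prompt: prompt,
    })
```

The same pattern scales to any number of models and metrics: each model only needs to expose a prompt-in, text-out interface, and every metric is a function over the shared evaluation items.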
What models are supported by Cetvel?
Cetvel supports a wide range of Turkish and multilingual LLMs, including, but not limited to, models from leading NLP libraries.
Do I need NLP expertise to use Cetvel?
No, Cetvel is designed to be user-friendly. However, basic knowledge of NLP concepts may help in interpreting results.
Can I benchmark models in languages other than Turkish?
Cetvel is primarily optimized for Turkish, but it can be adapted for other languages with additional configuration.
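As a rough illustration of what such additional configuration could involve (the `TaskConfig` structure, field names, and registry below are hypothetical assumptions, not Cetvel's actual configuration format), adapting the benchmark to another language essentially means describing that language's tasks, data, and metrics explicitly:

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class TaskConfig:
    """Hypothetical description of a single benchmark task."""
    name: str
    language: str          # ISO 639-1 code, e.g. "tr" or "de"
    dataset_path: str      # where the task's evaluation data lives
    metrics: List[str] = field(default_factory=lambda: ["exact_match"])

# A simple registry keyed by task name: Turkish tasks ship by default,
# and adapting to another language amounts to adding entries like the German one.
TASK_REGISTRY: Dict[str, TaskConfig] = {
    "tr_summarization": TaskConfig(
        name="tr_summarization",
        language="tr",
        dataset_path="data/tr/summarization.jsonl",
        metrics=["rouge_l"],
    ),
    "de_question_answering": TaskConfig(
        name="de_question_answering",
        language="de",
        dataset_path="data/de/qa.jsonl",
    ),
}

if __name__ == "__main__":
    for task in TASK_REGISTRY.values():
        print(f"{task.name}: language={task.language}, metrics={task.metrics}")
```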