Measure execution times of BERT models using WebGPU and WASM
The WebGPU Embedding Benchmark is a tool designed to measure the execution times of BERT models using WebGPU and WebAssembly (WASM). It provides a comprehensive way to evaluate and compare the performance of embedding models across different frameworks and configurations. By leveraging WebGPU's advanced capabilities, the benchmark helps developers optimize their machine learning workflows for better efficiency and speed.
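A minimal sketch of what such a timing loop might look like, assuming an async `embed` function exposed by whichever runtime is under test (the function name, warm-up count, and run count are illustrative, not the benchmark's actual API):

```javascript
// Time an async embedding call over several runs and report the mean latency.
// `embed` is a placeholder for the backend's inference call (e.g. a
// transformers.js pipeline); warm-up runs let JIT and shader caches settle.
async function benchmark(embed, input, { warmup = 3, runs = 20 } = {}) {
  for (let i = 0; i < warmup; i++) await embed(input); // warm-up, not timed
  const times = [];
  for (let i = 0; i < runs; i++) {
    const t0 = performance.now();
    await embed(input);
    times.push(performance.now() - t0); // milliseconds per call
  }
  const mean = times.reduce((a, b) => a + b, 0) / times.length;
  return { mean, times };
}
```

Running the same harness once with a WebGPU-backed `embed` and once with a WASM-backed one gives directly comparable numbers.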
To run the benchmark locally, install the dependencies and start the development server:

npm install
npm start
What are BERT embeddings and why are they important?
BERT (Bidirectional Encoder Representations from Transformers) embeddings are vector representations of text that capture semantic meaning. They are widely used in natural language processing tasks for improved model accuracy and efficiency.
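To make "capture semantic meaning" concrete: embedding vectors are usually compared with cosine similarity, where semantically similar texts score close to 1. A small self-contained helper (the vectors in the test are toy values, not real BERT outputs):

```javascript
// Cosine similarity between two embedding vectors:
// 1 = same direction (very similar), 0 = orthogonal (unrelated).
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```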
How do I interpret the benchmark results?
Results show execution times (e.g., inference time per batch) and other metrics. Lower times indicate better performance. Use these metrics to compare frameworks, models, or hardware configurations.
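One way to summarize a set of per-batch timings before comparing configurations is to look at the mean, the best case, and a tail percentile rather than a single number (the p95 choice and this helper are illustrative, not part of the benchmark's output format):

```javascript
// Summarize timing samples (in ms): mean, best case, and approximate p95 tail latency.
function summarize(times) {
  const sorted = [...times].sort((a, b) => a - b);
  const mean = sorted.reduce((s, t) => s + t, 0) / sorted.length;
  // Index-based p95: the value below which ~95% of samples fall.
  const p95 = sorted[Math.min(sorted.length - 1, Math.floor(sorted.length * 0.95))];
  return { mean, min: sorted[0], p95 };
}
```

Comparing means tells you average throughput; comparing p95 values tells you whether one backend has worse latency spikes.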
Which frameworks are supported?
The benchmark supports popular frameworks like TensorFlow, PyTorch, and ONNX. Additional frameworks can be added through configuration or plugins.