Measure execution times of BERT models using WebGPU and WASM
The WebGPU Embedding Benchmark is a tool designed to measure the execution times of BERT models using WebGPU and WebAssembly (WASM). It provides a comprehensive way to evaluate and compare the performance of embedding models across different frameworks and configurations. By leveraging WebGPU's advanced capabilities, the benchmark helps developers optimize their machine learning workflows for better efficiency and speed.
npm install
npm start
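Under the hood, the measurement the benchmark performs amounts to timing repeated embedding calls. Below is a minimal sketch of such a timing loop; `embed` is a hypothetical stand-in for the real model call (e.g. a WebGPU- or WASM-backed BERT pipeline), not the benchmark's actual API.

```javascript
// Hypothetical stand-in for a real embedding call; a real
// implementation would run the BERT model on WebGPU or WASM.
async function embed(texts) {
  return texts.map(() => new Float32Array(384));
}

// Time `runs` passes over the input, processing `batchSize` texts
// per call, and return the wall-clock time of each pass in ms.
async function timeBatches(texts, batchSize, runs) {
  const times = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    for (let j = 0; j < texts.length; j += batchSize) {
      await embed(texts.slice(j, j + batchSize));
    }
    times.push(performance.now() - start);
  }
  return times;
}
```

Repeating the loop several times and discarding the first (warm-up) run gives more stable numbers, since the first call typically includes shader compilation or model-loading overhead.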
What are BERT embeddings and why are they important?
BERT (Bidirectional Encoder Representations from Transformers) embeddings are vector representations of text that capture semantic meaning. They are widely used in natural language processing tasks for improved model accuracy and efficiency.
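A common way to use such embeddings is to compare texts by the cosine similarity of their vectors: values near 1 indicate semantically similar texts. A short sketch (the vectors below are toy stand-ins, not real BERT outputs):

```javascript
// Cosine similarity of two equal-length embedding vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

cosineSimilarity([1, 0, 1], [1, 0, 1]); // identical vectors → 1
cosineSimilarity([1, 0], [0, 1]);       // orthogonal vectors → 0
```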
How do I interpret the benchmark results?
Results show execution times (e.g., inference time per batch) and other metrics. Lower times indicate better performance. Use these metrics to compare frameworks, models, or hardware configurations.
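When comparing runs, it helps to reduce the raw per-batch times to summary statistics rather than eyeballing individual numbers. A small sketch of how one might summarize a list of measured times (the function name and statistics chosen here are illustrative, not part of the benchmark's output format):

```javascript
// Summarize per-batch inference times (in milliseconds) into the
// statistics typically compared across frameworks or hardware.
function summarize(timesMs) {
  const sorted = [...timesMs].sort((a, b) => a - b);
  const mean = sorted.reduce((sum, t) => sum + t, 0) / sorted.length;
  const median = sorted[Math.floor(sorted.length / 2)];
  // 95th percentile: the value below which 95% of samples fall.
  const p95 = sorted[Math.min(sorted.length - 1, Math.ceil(0.95 * sorted.length) - 1)];
  return { mean, median, p95 };
}

summarize([12, 10, 11, 30, 9]); // → { mean: 14.4, median: 11, p95: 30 }
```

The median and p95 are often more informative than the mean, since occasional slow batches (garbage collection, shader recompilation) can skew averages.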
Which frameworks are supported?
The benchmark supports popular frameworks such as TensorFlow, PyTorch, and ONNX. Additional frameworks can be added through configuration or plugins.