results
MTEB benchmark results for text embedding model evaluations.
What is results?
It is a tabular collection of MTEB evaluation outputs listing model identifiers, task names, and numeric metrics.
It is useful for researchers comparing embedding model rankings and for meta-analyses of benchmark distributions.
What you can build with results
Model selection for production
Filter results by task type and metric to identify top-performing embedding models for specific downstream applications like semantic search or classification.
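As a rough sketch of that filtering step, the snippet below ranks models for one task type and metric with pandas. The column names (model_name, task_name, task_type, metric, score), the helper top_models, and the toy rows are all assumptions made for illustration; the actual schema of results may differ, so inspect df.columns before adapting it.

```python
import pandas as pd

# Toy stand-in for the results table; column names and values are hypothetical.
df = pd.DataFrame({
    "model_name": ["model-a", "model-a", "model-b", "model-b", "model-c"],
    "task_name": ["task-1", "task-2", "task-1", "task-2", "task-1"],
    "task_type": ["Retrieval", "Classification", "Retrieval", "Classification", "Retrieval"],
    "metric": ["ndcg_at_10", "accuracy", "ndcg_at_10", "accuracy", "ndcg_at_10"],
    "score": [0.52, 0.71, 0.61, 0.68, 0.47],
})


def top_models(frame, task_type, metric, n=3):
    """Return the n best models by mean score for one task type and metric."""
    subset = frame[(frame["task_type"] == task_type) & (frame["metric"] == metric)]
    ranked = subset.groupby("model_name")["score"].mean().sort_values(ascending=False)
    return ranked.head(n)


ranking = top_models(df, "Retrieval", "ndcg_at_10")
print(ranking)
```

Averaging per model before ranking keeps a model with many evaluated tasks from dominating the list simply because it has more rows.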
Trend analysis over time
Aggregate scores across multiple model releases to track improvements in embedding quality on shared benchmarks without needing to rerun evaluations.
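One way such a trend could be computed is sketched below. It assumes each row already carries a release date and a model family, which results itself may not provide; those fields (release_date, family) and all values are hypothetical and would likely need to be joined in from separate model metadata.

```python
import pandas as pd

# Hypothetical rows: one mean benchmark score per model release.
releases = pd.DataFrame({
    "model_name": ["fam-v1", "fam-v2", "fam-v3", "other-v1", "other-v2"],
    "family": ["fam", "fam", "fam", "other", "other"],
    "release_date": pd.to_datetime(
        ["2022-03-01", "2023-01-15", "2024-02-10", "2022-09-01", "2023-11-20"]
    ),
    "score": [0.55, 0.60, 0.66, 0.50, 0.58],
})

# Mean score of all models released in each calendar year.
by_year = releases.groupby(releases["release_date"].dt.year)["score"].mean()

# Score gain between consecutive releases within the same family.
releases = releases.sort_values(["family", "release_date"])
releases["gain"] = releases.groupby("family")["score"].diff()

print(by_year)
print(releases[["model_name", "gain"]])
```

The first release in each family has no predecessor, so its gain is NaN rather than zero.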
Baseline reporting in papers
Query historical scores for established models to include standardized comparisons when publishing new embedding techniques or architectures.
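A baseline table of this kind might be assembled as in the following sketch, which pivots scores into one row per model and one column per task. The column names and the baseline model names are placeholders, not the real schema or real models.

```python
import pandas as pd

# Hypothetical scores for two established models on two tasks.
scores = pd.DataFrame({
    "model_name": ["baseline-a", "baseline-a", "baseline-b", "baseline-b"],
    "task_name": ["task-1", "task-2", "task-1", "task-2"],
    "score": [0.52, 0.71, 0.61, 0.68],
})

baselines = ["baseline-a", "baseline-b"]

# One row per baseline model, one column per task.
table = scores[scores["model_name"].isin(baselines)].pivot_table(
    index="model_name", columns="task_name", values="score", aggfunc="mean"
)

# Unweighted mean across the task columns.
table["average"] = table.mean(axis=1)

print(table.round(3))
```

From here, table.to_latex() or table.to_markdown() can export the comparison for a paper draft; to_markdown needs the optional tabulate package.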
Load results
1. Install the library: pip install datasets
2. Import the loader: from datasets import load_dataset
3. Load the dataset: ds = load_dataset("mteb/results")
4. Convert the train split to a DataFrame: df = ds["train"].to_pandas()
5. Filter rows by model name or task columns for analysis.
results: pros & cons
Pros
- Millions of pre-computed scores ready for analysis
- Standardized benchmark results across many models
- Directly loadable via Hugging Face datasets
- Raw tabular format supports custom queries
Cons
- Lacks task descriptions or metadata annotations
- Very large size may require filtering or sampling
- No built-in visualization or aggregation tools
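For the size concern, one option is to explore a reproducible random sample before running queries on the full table. The sketch below uses a synthetic 1,000-row frame with made-up columns (model_name, score) in place of the real data.

```python
import pandas as pd

# Synthetic stand-in for a large results table; columns are hypothetical.
big = pd.DataFrame({
    "model_name": [f"model-{i % 5}" for i in range(1000)],
    "score": [i / 1000 for i in range(1000)],
})

# Uniform 10% sample; random_state makes the draw repeatable.
sample = big.sample(frac=0.1, random_state=0)

# Stratified alternative: the same number of rows from every model.
per_model = big.groupby("model_name").sample(n=10, random_state=0)

print(len(sample), len(per_model))
```

The stratified draw keeps rarely evaluated models represented, which a uniform sample may miss.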
Frequently asked questions
What does results contain?
It contains raw evaluation scores from the Massive Text Embedding Benchmark (MTEB), covering model performance on embedding tasks.