Is MMLU free to use?

Yes, it is publicly available at no cost through the Hugging Face datasets library.

How do I access the MMLU dataset?

Load it with load_dataset('cais/mmlu', 'all') from the Hugging Face datasets library. Replace 'all' with a subject name such as 'abstract_algebra' to load a single subject.

What is the license for MMLU?

Refer to the dataset card on Hugging Face for current licensing and usage terms.

mmlu — Free Dataset Docs, Examples & Alternatives (2026)

What is mmlu?

MMLU is a collection of multiple-choice questions spanning 57 tasks in the humanities, social sciences, and sciences. It was introduced to test broad knowledge and reasoning in language models.

The benchmark is used by researchers to measure and compare model performance across many domains at once.

Data preview

A real sample from the dataset, showing all 4 columns.

question (string)	subject (string)	choices (list)	answer (class label)
Find the degree for the given field extension Q(sqrt(2), sqrt(3), sqrt(18)) over Q.	abstract_algebra	["0","4","2","6"]	1
Let p = (1, 2, 5, 4)(2, 3) in S_5 . Find the index of <p> in S_5.	abstract_algebra	["8","2","24","120"]	2
Find all zeros in the indicated finite field of the given polynomial with coefficients in that field. x^5 + 3x^3 + x^2 + 2x in Z_5	abstract_algebra	["0","1","0,1","0,4"]	3
Statement 1 | A factor group of a non-Abelian group is non-Abelian. Statement 2 | If K is a normal subgroup of H and H is a normal subgroup of G, then K is a normal subgroup of G.	abstract_algebra	["True, True","False, False","True, False","False, True"]	1
Find the product of the given polynomials in the given polynomial ring. f(x) = 4x - 5, g(x) = 2x^2 - 4x + 2 in Z_8[x].	abstract_algebra	["2x^2 + 5","6x^2 + 4x + 6","0","x^2 + 1"]	1
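
The answer column in the preview is a zero-based index into choices, not the answer text itself. A minimal sketch of resolving it, assuming each row arrives as a plain dict with the four columns shown above (the helper name `answer_text` is hypothetical, not part of any library):

```python
def answer_text(row: dict) -> str:
    """Return the text of the correct choice for one MMLU-style row."""
    choices = row["choices"]
    index = row["answer"]  # zero-based position of the correct choice
    if not 0 <= index < len(choices):
        raise ValueError(
            f"answer index {index} out of range for {len(choices)} choices"
        )
    return choices[index]


# First row of the preview table above.
row = {
    "question": "Find the degree for the given field extension "
                "Q(sqrt(2), sqrt(3), sqrt(18)) over Q.",
    "subject": "abstract_algebra",
    "choices": ["0", "4", "2", "6"],
    "answer": 1,
}
print(answer_text(row))  # prints 4
```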

Dataset structure

Total rows: 231,400
Columns: 4
Size on disk: 98.8 MB

Subset	Split	Rows
abstract_algebra	test	100
abstract_algebra	validation	11
abstract_algebra	dev	5
all	test	14,042
all	validation	1,531
all	dev	285
all	auxiliary_train	99,842
anatomy	test	135
anatomy	validation	14
anatomy	dev	5
astronomy	test	152
astronomy	validation	16

What you can build with mmlu

LLM Benchmarking

Run standardized evaluations of language models across 57 subjects to measure knowledge breadth in humanities, sciences, and professions.
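
Because every question has exactly one correct choice index, a benchmark score reduces to plain accuracy. The sketch below is one way to compute it, assuming model predictions have already been converted to choice indices. The function name and the `predicted` values are illustrative, not real model output.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], references: Sequence[int]) -> float:
    """Fraction of predictions that match the reference answer indices."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


gold = [1, 2, 3, 1, 1]       # answer column of the five preview rows
predicted = [1, 2, 0, 1, 3]  # hypothetical model outputs, for illustration only
print(accuracy(predicted, gold))  # prints 0.6
```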

Zero-shot Performance Testing

Assess models on multiple-choice question answering without additional training by scoring them on the built-in test split. The validation and dev splits remain available for prompt tuning and few-shot examples.
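
A zero-shot run needs a prompt that lists the choices and a way to map the model's reply back to a choice index. The following is a rough sketch under assumptions: the A to D lettering, the "Answer:" suffix, and both function names are illustrative conventions, not something the dataset prescribes, and the regex-based parser is deliberately crude.

```python
import re
from typing import List, Optional

LETTERS = "ABCD"


def build_zero_shot_prompt(question: str, choices: List[str]) -> str:
    """Format one question as a lettered multiple-choice prompt."""
    lines = [question]
    lines += [f"{letter}. {choice}" for letter, choice in zip(LETTERS, choices)]
    lines.append("Answer:")
    return "\n".join(lines)


def parse_letter(model_output: str) -> Optional[int]:
    """Map the first standalone uppercase A-D in the output to a choice index."""
    match = re.search(r"\b([ABCD])\b", model_output)
    return LETTERS.index(match.group(1)) if match else None


prompt = build_zero_shot_prompt(
    "Find the degree for the given field extension "
    "Q(sqrt(2), sqrt(3), sqrt(18)) over Q.",
    ["0", "4", "2", "6"],
)
print(prompt)
print(parse_letter("The answer is B."))  # prints 1
```

Returning None for unparseable replies keeps them distinguishable from wrong answers, so they can be counted separately when scoring.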

Subject-specific Analysis

Isolate individual subjects like mathematics or history to diagnose model strengths and weaknesses in targeted domains.
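
Per-subject diagnosis can be sketched by grouping results on the subject column. This assumes rows are dicts carrying at least `subject` and `answer`, and that predictions are choice indices in the same order. The rows and predictions below are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def accuracy_by_subject(rows: Sequence[dict], predictions: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy separately for each value of the subject column."""
    if len(rows) != len(predictions):
        raise ValueError("rows and predictions must have the same length")
    totals: Dict[str, int] = defaultdict(int)
    correct: Dict[str, int] = defaultdict(int)
    for row, prediction in zip(rows, predictions):
        subject = row["subject"]
        totals[subject] += 1
        correct[subject] += int(prediction == row["answer"])
    return {subject: correct[subject] / totals[subject] for subject in totals}


# Illustrative rows only; real rows would come from the dataset.
rows: List[dict] = [
    {"subject": "abstract_algebra", "answer": 1},
    {"subject": "abstract_algebra", "answer": 2},
    {"subject": "anatomy", "answer": 0},
]
predictions = [1, 0, 0]  # hypothetical model outputs
print(accuracy_by_subject(rows, predictions))
# prints {'abstract_algebra': 0.5, 'anatomy': 1.0}
```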

Load mmlu

Python

from datasets import load_dataset

# "all" selects every subject; pass a subject name such as "abstract_algebra" to load one.
ds = load_dataset("cais/mmlu", "all")

1. pip install datasets
2. from datasets import load_dataset
3. dataset = load_dataset('cais/mmlu', 'all')
4. Select a split such as dataset['test'] or dataset['auxiliary_train'], or pass a subject name such as 'abstract_algebra' in place of 'all' to load a single subject
5. Parse each example's question, choices, and answer for evaluation scripts
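
The steps above can be sketched as two small helpers. This is a minimal sketch, not a definitive loader: `load_subject` and `parse_example` are hypothetical names, the default subject and split are arbitrary picks from the structure table, and the download itself needs the datasets package and network access.

```python
from typing import List, Tuple


def load_subject(subject: str = "abstract_algebra", split: str = "test"):
    """Download one subject/split pair from the Hugging Face Hub."""
    # Imported lazily so parse_example stays usable without the package installed.
    from datasets import load_dataset

    return load_dataset("cais/mmlu", subject, split=split)


def parse_example(example: dict) -> Tuple[str, List[str], int]:
    """Pull out the three fields an evaluation script typically needs."""
    return (
        example["question"],
        list(example["choices"]),
        int(example["answer"]),  # zero-based index into choices
    )


# Offline demonstration on a row copied from the data preview.
sample = {
    "question": "Let p = (1, 2, 5, 4)(2, 3) in S_5 . Find the index of <p> in S_5.",
    "subject": "abstract_algebra",
    "choices": ["8", "2", "24", "120"],
    "answer": 2,
}
question, choices, answer = parse_example(sample)
print(choices[answer])  # prints 24
```

With network access, iterating over `load_subject()` and calling `parse_example` on each row would yield the same three fields for every question in the chosen split.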

mmlu: pros & cons

Pros

+ Broad coverage of 57 subjects
+ Large scale, between 100K and 1M examples
+ Multiple-choice format simplifies automated scoring
+ Direct support for question-answering evaluation

Cons

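– Restricted to multiple-choice questions only
– Subjects must be selected or filtered manually, either by loading one subset per subject or by filtering the all subset on its subject column
– Dataset size varies by subject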

Frequently asked questions

What is MMLU?

A collection of multiple-choice questions spanning 57 academic and professional subjects for evaluating question-answering systems.

Similar datasets

Other Text & NLP options worth comparing.

KakologArchives

Text & NLP · KakologArchives

Verified

Archive of 11 years of Nico Nico Jikkyo live commentary logs.

Dataset · ↓ 1.8M · Free

wikitext

Text & NLP · Salesforce

Verified

Over 100 million tokens from Wikipedia for language modeling benchmarks.

Dataset · ↓ 1.3M · Free

gsm8k

Text & NLP · openai

Verified

8.5K grade school math word problems requiring multi-step arithmetic reasoning.

Dataset · ↓ 901K · Free
