Is SWE-bench_Multilingual free to use?

Yes, it is publicly available on the Hugging Face Hub at no cost.

How do I access the dataset?

Load it directly with the Hugging Face datasets library using load_dataset('SWE-bench/SWE-bench_Multilingual').

What is the license?

It follows the original SWE-bench license; check the repository card for exact terms.

SWE-bench_Multilingual — Free Dataset Docs, Examples & Alternatives (2026)

What is SWE-bench_Multilingual?

SWE-bench_Multilingual is a benchmark dataset of real GitHub issues, each tied to the repository it was filed against and a reference patch, for evaluating models on software engineering tasks across multiple programming languages.

It is aimed at researchers in NLP and code generation who want to benchmark how well models resolve software issues outside the Python-only setting of the original SWE-bench.

What you can build with SWE-bench_Multilingual

Benchmarking multilingual code agents

Measure how well LLMs resolve GitHub issues across repositories written in different programming languages.
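
A minimal sketch of how model outputs might be packaged for scoring, assuming the prediction format used by the SWE-bench evaluation harness (one JSON object per line with instance_id, model_name_or_path and model_patch). The generate_patch callable, the model name and the output path are hypothetical placeholders for your own agent and setup.

```python
import json
from typing import Callable, Iterable, Mapping


def write_predictions(
    instances: Iterable[Mapping[str, str]],
    generate_patch: Callable[[str], str],  # hypothetical: your model or agent
    model_name: str,
    out_path: str,
) -> int:
    """Write one JSON line per instance and return the number written."""
    count = 0
    with open(out_path, "w", encoding="utf-8") as fh:
        for inst in instances:
            record = {
                "instance_id": inst["instance_id"],
                "model_name_or_path": model_name,
                # In this sketch the model sees only the issue text.
                "model_patch": generate_patch(inst["problem_statement"]),
            }
            fh.write(json.dumps(record) + "\n")
            count += 1
    return count
```

A real agent would usually also inspect the repository itself; passing only the issue text keeps the sketch short.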

Training cross-lingual repair models

Fine-tune models on issue-to-patch pairs drawn from several programming languages to improve generalization.
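
One way the issue-to-patch pairs could be prepared is sketched below, assuming each record carries the problem_statement and patch fields. The prompt wording and the prompt/completion key names are illustrative choices, not part of the dataset.

```python
from typing import Dict, Iterable, List, Mapping


def to_training_pairs(instances: Iterable[Mapping[str, str]]) -> List[Dict[str, str]]:
    """Turn benchmark records into prompt/completion pairs."""
    pairs = []
    for inst in instances:
        patch = inst.get("patch", "").strip()
        if not patch:
            continue  # skip records without a reference patch
        prompt = (
            "Resolve the following issue by producing a unified diff.\n\n"
            f"Issue:\n{inst['problem_statement'].strip()}\n"
        )
        pairs.append({"prompt": prompt, "completion": patch})
    return pairs
```

Keep the instances used for fine-tuning separate from those used for scoring, since training on evaluated instances would inflate the result.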

Comparing language-specific performance

Run controlled experiments to quantify accuracy gaps between codebases written in different programming languages.
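
A per-language comparison could be computed along these lines. This is a sketch under two assumptions: you already have a pass/fail result per instance from your evaluation run, and you supply your own instance-to-language mapping (language_of is hypothetical; the listing does not document a language field).

```python
from collections import defaultdict
from typing import Dict, Mapping


def resolve_rate_by_language(
    resolved: Mapping[str, bool],    # instance_id -> did the patch pass?
    language_of: Mapping[str, str],  # hypothetical: instance_id -> language label
) -> Dict[str, float]:
    """Return the fraction of resolved instances for each language label."""
    totals = defaultdict(int)
    passed = defaultdict(int)
    for instance_id, ok in resolved.items():
        lang = language_of.get(instance_id, "unknown")
        totals[lang] += 1
        passed[lang] += int(ok)
    return {lang: passed[lang] / totals[lang] for lang in totals}
```

Instances missing from the mapping are grouped under "unknown" rather than dropped, so gaps in the mapping stay visible in the output.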

Load SWE-bench_Multilingual

Python

# Install the library first: pip install datasets
from datasets import load_dataset

ds = load_dataset("SWE-bench/SWE-bench_Multilingual")

# Print the first record of the test split
print(ds["test"][0])

# Use the 'instance_id', 'problem_statement' and 'patch' fields for evaluation
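
Full records can be long, so a short preview of the three fields named above may be easier to read. The sketch below assumes those field names. The helper names summarize and preview_first are made up for this example, and preview_first needs the datasets package and network access.

```python
from typing import Any, Dict, Mapping

FIELDS = ("instance_id", "problem_statement", "patch")


def summarize(record: Mapping[str, Any], width: int = 60) -> Dict[str, str]:
    """Return a one-line preview of each evaluation field."""
    out = {}
    for name in FIELDS:
        text = str(record.get(name, "")).replace("\n", " ")
        out[name] = text if len(text) <= width else text[: width - 3] + "..."
    return out


def preview_first(split: str = "test") -> None:
    """Load the dataset and print a preview of its first record."""
    from datasets import load_dataset  # imported lazily: requires `pip install datasets`

    ds = load_dataset("SWE-bench/SWE-bench_Multilingual", split=split)
    for name, preview in summarize(ds[0]).items():
        print(f"{name}: {preview}")
```

Calling preview_first() downloads the dataset on first use; summarize works on any dictionary-like record and can be tried offline.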

SWE-bench_Multilingual: pros & cons

Pros

+ Extends SWE-bench to programming languages beyond Python
+ Real GitHub issues and patches
+ Directly loadable via Hugging Face
+ Supports standardized model comparisons

Cons

- Evaluation requires full repository setup
- Limited documentation on language coverage
- High compute cost for full runs

Frequently asked questions

What is SWE-bench_Multilingual?

A multilingual version of the SWE-bench dataset containing real software engineering tasks from GitHub issues across multiple programming languages.

Similar datasets

Other Text & NLP options worth comparing.

KakologArchives

Text & NLP · KakologArchives

Verified

Archive of 11 years of Nico Nico Jikkyo live commentary logs.

Dataset · 1.8M downloads · Free

wikitext

Text & NLP · Salesforce

Verified

Over 100 million tokens from Wikipedia for language modeling benchmarks.

Dataset · 1.3M downloads · Free

gsm8k

Text & NLP · openai

Verified

8.5K grade school math word problems requiring multi-step arithmetic reasoning.

Dataset · 901K downloads · Free
