DS-1000 supplies 1,000 data science coding problems in simplified text format.
DS-1000 is a collection of approximately one thousand data science coding problems released in simplified text format for model evaluation.
It is useful for researchers running benchmarks and comparing language models on realistic data science code tasks.
Run generated code against the 1,000 problems to measure pass rates and compare different LLMs or fine-tuned models.
Integrate the simplified test cases into CI workflows that score model outputs on functional correctness.
Inspect incorrect solutions across problem categories to identify patterns where code generation models struggle.
from datasets import load_dataset
ds = load_dataset("xlangai/DS-1000")
A dataset of roughly one thousand problems for code generation evaluation, distributed in simplified format on Hugging Face.
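Once each generated solution has been run and marked as passing or failing, the pass-rate comparison and per-category inspection mentioned in the use cases can be sketched roughly as follows. This is a minimal illustration, not part of the dataset's tooling: the `pass_rates` helper, the `(category, passed)` record shape, and the sample outcomes are all hypothetical and made up for demonstration, and the category names are placeholders rather than values read from the dataset.

```python
from collections import defaultdict


def pass_rates(results):
    """Compute the fraction of passing solutions per category.

    `results` is an iterable of (category, passed) pairs, where `passed`
    is True if the generated code passed that problem's tests.
    This record shape is an assumption made for this sketch.
    """
    totals = defaultdict(int)
    passes = defaultdict(int)
    for category, passed in results:
        totals[category] += 1
        if passed:
            passes[category] += 1
    # Divide passes by attempts for every category that was seen.
    return {category: passes[category] / totals[category] for category in totals}


# Made-up outcomes for illustration only; these are not real benchmark results.
sample_results = [
    ("Pandas", True),
    ("Pandas", False),
    ("Numpy", True),
]

rates = pass_rates(sample_results)
for category, rate in sorted(rates.items()):
    print(f"{category}: {rate:.0%}")
```

Categories with low rates are natural starting points when looking for patterns in incorrect solutions.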