Skip to content
OpenThoughts-1k-sample logo

OpenThoughts-1k-sample

Verified

1k sample of synthetic reasoning data across math, science, code, and puzzles.

DatasetAI & Machine Learning997K/moFree
Open dataset
Updated 2026-06-15

What is OpenThoughts-1k-sample?

OpenThoughts-1k-sample is a 1,000-row subset of the OpenThoughts-114k dataset of synthetic reasoning sequences. The data spans mathematics, science, programming tasks, and logic puzzles.

It is intended for researchers and practitioners training or evaluating language models on multi-domain reasoning. The subset directly supported fine-tuning of the OpenThinker-7B and OpenThinker-32B models.

What you can build with OpenThoughts-1k-sample

Fine-tune small reasoning models

Train or adapt 7B-scale models on the 1k synthetic traces to improve step-by-step performance in math and code tasks.

Prototype chain-of-thought evaluation

Run quick benchmarks that measure how well a model follows the provided reasoning paths across science and puzzle examples.

Create domain-specific data mixes

Blend the traces with other datasets to build balanced training sets covering mathematics, science, code, and logic puzzles.

Load OpenThoughts-1k-sample

Python
from datasets import load_dataset

ds = load_dataset("ryanmarten/OpenThoughts-1k-sample")
  1. 1pip install datasets
  2. 2from datasets import load_dataset
  3. 3ds = load_dataset('ryanmarten/OpenThoughts-1k-sample')
  4. 4Inspect the 'train' split and reasoning fields
  5. 5Use the traces directly for supervised fine-tuning

OpenThoughts-1k-sample: pros & cons

Pros

  • +Small 1k size enables fast local experiments
  • +Diverse coverage of math, science, code and puzzles
  • +Already validated by training OpenThinker-7B/32B
  • +Synthetic traces include explicit reasoning steps

Cons

  • Only 1,000 examples total
  • Synthetic data may contain model-specific artifacts
  • Subset of larger collection so topic coverage is limited
Did you find this helpful?

Frequently asked questions

A 1,000-example subset of synthetic reasoning traces in mathematics, science, code, and puzzles drawn from the OpenThoughts-114k collection.

User reviews

Verified reviews from the community shape this listing's rating.

Loading reviews…

Sign in to review

Promote OpenThoughts-1k-sample

Add this badge to your website, or share the tool.

DFeatured on DhanasviOpenThoughts-1k-sample 1