OpenThoughts-1k-sample
Verified1k sample of synthetic reasoning data across math, science, code, and puzzles.
What is OpenThoughts-1k-sample?
OpenThoughts-1k-sample is a 1,000-row subset of the OpenThoughts-114k dataset of synthetic reasoning sequences. The data spans mathematics, science, programming tasks, and logic puzzles.
It is intended for researchers and practitioners training or evaluating language models on multi-domain reasoning. The subset directly supported fine-tuning of the OpenThinker-7B and OpenThinker-32B models.
What you can build with OpenThoughts-1k-sample
Fine-tune small reasoning models
Train or adapt 7B-scale models on the 1k synthetic traces to improve step-by-step performance in math and code tasks.
Prototype chain-of-thought evaluation
Run quick benchmarks that measure how well a model follows the provided reasoning paths across science and puzzle examples.
Create domain-specific data mixes
Blend the traces with other datasets to build balanced training sets covering mathematics, science, code, and logic puzzles.
Load OpenThoughts-1k-sample
from datasets import load_dataset
ds = load_dataset("ryanmarten/OpenThoughts-1k-sample")- 1pip install datasets
- 2from datasets import load_dataset
- 3ds = load_dataset('ryanmarten/OpenThoughts-1k-sample')
- 4Inspect the 'train' split and reasoning fields
- 5Use the traces directly for supervised fine-tuning
OpenThoughts-1k-sample: pros & cons
Pros
- +Small 1k size enables fast local experiments
- +Diverse coverage of math, science, code and puzzles
- +Already validated by training OpenThinker-7B/32B
- +Synthetic traces include explicit reasoning steps
Cons
- –Only 1,000 examples total
- –Synthetic data may contain model-specific artifacts
- –Subset of larger collection so topic coverage is limited
Frequently asked questions
A 1,000-example subset of synthetic reasoning traces in mathematics, science, code, and puzzles drawn from the OpenThoughts-114k collection.
User reviews
Verified reviews from the community shape this listing's rating.
Loading reviews…