SynData

Verified

Large-scale multimodal dataset covering vision, language, and action data.

Dataset | Text & NLP | 512K/mo | Free
Open dataset
Updated 2026-06-15

What is SynData?

SynData is a large-scale real-world multimodal dataset that covers vision, language, and action.

It supplies human data for embodied intelligence training and is intended for machine learning researchers working on multimodal models.

Data preview

A real sample from the dataset. Six of its 14 columns are shown.

subset  clip_id                    task_key   task_name     volume_id  rel_path
string  string                     string     string        string     string
ego     clip_pfmzgf2s2dwgrmxgex37  task_0001  Sort clothes  000001     tasks/task_0001/000001.zarr
ego     clip_xl4kfibjax442gnvmqax  task_0001  Sort clothes  000001     tasks/task_0001/000001.zarr
ego     clip_elhvafiqotdzim5uybss  task_0001  Sort clothes  000001     tasks/task_0001/000001.zarr
ego     clip_byhszwgx2oju3tvv3ss7  task_0001  Sort clothes  000001     tasks/task_0001/000001.zarr
ego     clip_6d4b6ka7arh3354nlho4  task_0001  Sort clothes  000001     tasks/task_0001/000001.zarr
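
As an illustration of how these rows fit together, the sketch below models the six visible columns and groups clips by the volume path they point to. The column boundaries (for example, reading the fused prefix as subset "ego" followed by a clip_id starting with "clip_") are inferred from the preview, and ClipRecord, clips_by_volume, and path_matches_keys are hypothetical names, not part of the dataset or any library.

```python
from collections import defaultdict
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class ClipRecord:
    """One preview row (6 of the 14 columns; the rest are not shown in the listing)."""
    subset: str
    clip_id: str
    task_key: str
    task_name: str
    volume_id: str
    rel_path: str


# The five preview rows copied from the listing. Column boundaries are an assumption.
_COMMON = ("task_0001", "Sort clothes", "000001", "tasks/task_0001/000001.zarr")
PREVIEW = [
    ClipRecord("ego", "clip_pfmzgf2s2dwgrmxgex37", *_COMMON),
    ClipRecord("ego", "clip_xl4kfibjax442gnvmqax", *_COMMON),
    ClipRecord("ego", "clip_elhvafiqotdzim5uybss", *_COMMON),
    ClipRecord("ego", "clip_byhszwgx2oju3tvv3ss7", *_COMMON),
    ClipRecord("ego", "clip_6d4b6ka7arh3354nlho4", *_COMMON),
]


def clips_by_volume(records):
    """Map each rel_path to the list of clip ids that reference it."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec.rel_path].append(rec.clip_id)
    return dict(groups)


def path_matches_keys(rec):
    """Check the apparent tasks/<task_key>/<volume_id>.zarr layout for one row."""
    parts = PurePosixPath(rec.rel_path).parts
    return parts == ("tasks", rec.task_key, f"{rec.volume_id}.zarr")
```

In the preview, all five clips share one rel_path, which suggests that several clips are stored in a single .zarr volume. Whether that holds across the full dataset is not stated in the listing.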

Dataset structure

Total rows: 449,363
Columns: 14
Size on disk: 24.4 MB

Subset     Split  Rows
all_clips  train  449,363
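
A quick back-of-the-envelope check on these figures: dividing the size on disk by the row count gives the average storage per row. The sketch below assumes 1 MB = 1,000,000 bytes, which the listing does not specify, and bytes_per_row is a hypothetical helper.

```python
TOTAL_ROWS = 449_363      # "Total rows" from the listing
SIZE_ON_DISK_MB = 24.4    # "Size on disk" from the listing


def bytes_per_row(size_mb, rows, bytes_per_mb=1_000_000):
    """Average on-disk bytes per row. The MB definition is an assumption."""
    return size_mb * bytes_per_mb / rows


avg = bytes_per_row(SIZE_ON_DISK_MB, TOTAL_ROWS)
print(f"about {avg:.0f} bytes per row")
```

A few dozen bytes per row is far too small for video or action streams. One plausible reading, which the listing does not confirm, is that this table is a metadata index and the sensor data lives in the .zarr volumes named in rel_path. The on-disk size may also reflect compression.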

What you can build with SynData

Train multimodal language models

Develop models that jointly process language with vision and action sequences from real human interactions for improved context understanding.

Build robotics simulation agents

Create agents that learn action prediction alongside language generation using the dataset's combined vision-language-action samples.

Evaluate cross-modal transfer

Test how well NLP models generalize when fine-tuned on multimodal human data spanning 100k-1M samples.

Load SynData

Python
from datasets import load_dataset

ds = load_dataset("PsiBotAI/SynData")
  1. pip install datasets
  2. from datasets import load_dataset
  3. dataset = load_dataset("PsiBotAI/SynData")
  4. print(dataset["train"].features) to inspect the columns and their types
  5. Carve a test set out of the train split (the only split listed) and preprocess for your pipeline
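
The steps above can be sketched as a single helper. This is a minimal sketch, not the publisher's recommended workflow: it assumes the repository id "PsiBotAI/SynData" resolves as written, that the all_clips subset loads as the default configuration, and that a random 90/10 split suits your use case. load_and_split is a hypothetical name.

```python
REPO_ID = "PsiBotAI/SynData"  # repository id as written in the listing


def load_and_split(test_size=0.1, seed=0):
    """Download the train split and carve out a held-out test set."""
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    # The listing shows a single split named "train".
    train = load_dataset(REPO_ID, split="train")
    print(train.features)  # column names and types

    # Dataset.train_test_split returns a DatasetDict with "train" and "test" keys.
    return train.train_test_split(test_size=test_size, seed=seed)
```

Calling splits = load_and_split() and then reading splits["train"].num_rows and splits["test"].num_rows shows the resulting sizes. Because the preview shows several clips sharing one volume, a random row-level split may place clips from the same recording in both sets. Grouping by task_key or volume_id before splitting is one way to avoid that, if it matters for your evaluation.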

SynData: pros & cons

Pros

  • Multimodal coverage across vision, language, and action
  • Real-world human data samples
  • Accessible through Hugging Face datasets library
  • Size range supports mid-scale experiments

Cons

  • Description gives only a 100k-1M sample range; the exact count (449,363 rows) appears only in the structure table
  • No license or usage terms detailed
  • Category listed as nlp-text despite multimodal description

Frequently asked questions

A multimodal dataset from PsiBotAI containing real-world human data across vision, language, and action dimensions with 100,000 to 1 million samples.
