Is SpreadsheetBench free?

Yes, it is publicly available on the Hugging Face Hub at no cost.

How do I access the dataset?

Load it with the Hugging Face datasets library by calling load_dataset("KAKA22/SpreadsheetBench").

For the exact license and usage terms, check the dataset repository on Hugging Face.

SpreadsheetBench

Real-world spreadsheet manipulation benchmark with 912 authentic questions for LLMs.

Dataset | AI & Machine Learning | ↓ 144K/mo | Free


Updated 2026-06-16

What is SpreadsheetBench?

SpreadsheetBench is a benchmark of 912 real questions for spreadsheet manipulation, collected exclusively from practical user workflows and paired with corresponding spreadsheet files.

It supports evaluating large language models on authentic spreadsheet tasks and is intended for researchers who want to compare model performance on real user questions with results from existing synthesized benchmarks.

What you can build with SpreadsheetBench

LLM spreadsheet agent evaluation

Test how well a new LLM or agent handles real user queries like formula creation, data filtering, and chart generation on authentic .xlsx files.
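One possible way to score such a run is to compare the cells a model wrote with the values you expect. The sketch below is illustrative only: `read_cells` and `cells_match` are hypothetical helper names, not part of SpreadsheetBench, and the idea of a per-task dictionary of expected cell values is an assumption about how you store ground truth. `read_cells` needs openpyxl installed; `cells_match` is plain Python.

```python
from typing import Any, Dict


def read_cells(path: str, sheet: str, cell_range: str) -> Dict[str, Any]:
    """Read values from a range of an .xlsx file (hypothetical helper, needs openpyxl)."""
    # Imported lazily so cells_match below can be used without openpyxl.
    from openpyxl import load_workbook
    from openpyxl.utils import range_boundaries

    # data_only=True returns cached formula results rather than formula text.
    # openpyxl does not compute formulas, so uncalculated cells come back as None.
    wb = load_workbook(path, data_only=True)
    ws = wb[sheet]
    min_col, min_row, max_col, max_row = range_boundaries(cell_range)
    values: Dict[str, Any] = {}
    for row in ws.iter_rows(min_row=min_row, max_row=max_row,
                            min_col=min_col, max_col=max_col):
        for cell in row:
            values[cell.coordinate] = cell.value
    return values


def _is_number(value: Any) -> bool:
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def cells_match(actual: Dict[str, Any], expected: Dict[str, Any],
                tol: float = 1e-9) -> bool:
    """Return True if every expected cell is present in `actual` with an equal value."""
    for ref, want in expected.items():
        got = actual.get(ref)
        if _is_number(want) and _is_number(got):
            # Numeric cells are compared with a small absolute tolerance.
            if abs(got - want) > tol:
                return False
        elif got != want:
            return False
    return True
```

A call such as `cells_match(read_cells("answer.xlsx", "Sheet1", "A1:B2"), expected)` would then yield one pass/fail result per task; the file name, sheet name, and range here are placeholders.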

Fine-tuning for office automation

Use the 912 question-file pairs as supervised data to fine-tune models that output correct spreadsheet operations or Python code for pandas/openpyxl.
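As a rough sketch of how question-file pairs might be turned into supervised examples, the code below writes prompt/completion records as JSON Lines. The record keys (`prompt`, `completion`), the prompt wording, and the reference solution are all assumptions made for illustration; the listing above does not specify the dataset's fields or whether reference code is included, so you would supply those yourself.

```python
import json
from typing import Dict, Iterable


def to_sft_record(question: str, spreadsheet_path: str,
                  solution_code: str) -> Dict[str, str]:
    """Build one prompt/completion pair (hypothetical record layout)."""
    prompt = (
        f"You are given the spreadsheet at {spreadsheet_path}.\n"
        f"Task: {question.strip()}\n"
        "Write Python code that performs the task."
    )
    return {"prompt": prompt, "completion": solution_code.strip()}


def to_jsonl(records: Iterable[Dict[str, str]]) -> str:
    """Serialize records as JSON Lines, one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


# Made-up example; the question, path, and solution are placeholders.
records = [
    to_sft_record(
        question="Sum column B and write the total to B10. ",
        spreadsheet_path="task_001/input.xlsx",
        solution_code="ws['B10'] = sum(c.value for c in ws['B2:B9'][0])\n",
    )
]
jsonl_text = to_jsonl(records)
```

Writing `jsonl_text` to a file gives a dataset in a shape that many fine-tuning tools accept, though the exact keys a given trainer expects may differ.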

Comparative benchmarking

Run standardized evaluations against other LLMs to measure progress on practical spreadsheet manipulation without relying on synthetic test sets.
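Once each model has a pass/fail result per task, a comparison can be as simple as ranking by pass rate. The following is a minimal sketch under that assumption; `pass_rate`, `rank_models`, and the model names are hypothetical, and the result lists are invented for the example rather than measured.

```python
from typing import Dict, List, Tuple


def pass_rate(results: List[bool]) -> float:
    """Fraction of tasks passed; 0.0 when there are no results."""
    return sum(results) / len(results) if results else 0.0


def rank_models(results_by_model: Dict[str, List[bool]]) -> List[Tuple[str, float]]:
    """Return (model name, pass rate) pairs, best model first."""
    rates = [(name, pass_rate(res)) for name, res in results_by_model.items()]
    return sorted(rates, key=lambda pair: pair[1], reverse=True)


# Invented per-task outcomes for two placeholder models on the same four tasks.
demo_results = {
    "model_b": [True, False, False, False],
    "model_a": [True, False, True, True],
}
ranking = rank_models(demo_results)
for name, rate in ranking:
    print(f"{name}: {rate:.1%}")
```

For a fair comparison, every model should be run on the same tasks with the same scoring rule.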

Load SpreadsheetBench

Python

from datasets import load_dataset

ds = load_dataset("KAKA22/SpreadsheetBench")

1. Install the library: pip install datasets
2. Import the loader: from datasets import load_dataset
3. Load the dataset: dataset = load_dataset("KAKA22/SpreadsheetBench")
4. Access 'questions' and the linked spreadsheet files in the dataset splits.
5. Run your model on each query and compare its outputs against the ground-truth actions.
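The last step can be sketched as a small evaluation loop. This is an illustration under stated assumptions, not the benchmark's official harness: the task keys `question` and `expected`, the `evaluate` function, and the stub model are all hypothetical, and a real run would replace the stub with a call to your model and the check with a spreadsheet-level comparison.

```python
from typing import Any, Callable, Dict, List

Task = Dict[str, Any]


def evaluate(tasks: List[Task],
             model: Callable[[str], Any],
             check: Callable[[Any, Any], bool]) -> List[bool]:
    """Run `model` on each task and record whether `check` accepts its output."""
    results: List[bool] = []
    for task in tasks:
        try:
            output = model(task["question"])
            results.append(bool(check(output, task["expected"])))
        except Exception:
            # A crash in the model or the check counts as a failed task.
            results.append(False)
    return results


def stub_model(question: str) -> int:
    """Toy stand-in for an LLM: adds up the integers found in the question."""
    return sum(int(tok) for tok in question.split() if tok.isdigit())


# Hypothetical tasks; the second expected value is deliberately wrong for the stub.
demo_tasks = [
    {"question": "add 1 and 2", "expected": 3},
    {"question": "add 2 and 2", "expected": 5},
]
demo_outcomes = evaluate(demo_tasks, stub_model, lambda got, want: got == want)
accuracy = sum(demo_outcomes) / len(demo_outcomes)
```

Counting exceptions as failures keeps one bad task from aborting the whole run, which matters when the model output is executed as code.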