GUI-360
Dataset of 1.2M+ GUI action steps from Windows office apps.
What is GUI-360?
GUI-360 is a collection of multi-modal trajectories recording interactions with common Windows office applications. Each entry pairs screenshots and accessibility data with action sequences and reasoning traces.
It supports development and evaluation of computer-using agents that operate graphical interfaces in productivity software environments.
What you can build with GUI-360
Train office automation agents
Fine-tune multimodal models to execute sequences of actions in Word, Excel, and PowerPoint using the 1.2M action steps and reasoning traces.
Benchmark computer-using agents
Evaluate GUI agents on full-resolution screenshots paired with accessibility metadata to measure task completion in productivity software.
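One way to turn "task completion" into a number is to score each predicted step against the recorded one and count a trajectory as completed only when every step matches. The sketch below assumes that approach. The GoldStep and PredictedStep types, their fields, and the click-inside-bounding-box rule are illustrative assumptions, not GUI-360's actual schema or an official metric.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence, Tuple


@dataclass(frozen=True)
class GoldStep:
    """Recorded step (hypothetical fields, not the dataset's real schema)."""
    action: str                      # e.g. "click" or "type"
    bbox: Tuple[int, int, int, int]  # target element: left, top, right, bottom


@dataclass(frozen=True)
class PredictedStep:
    """Step proposed by the agent under evaluation (hypothetical fields)."""
    action: str
    point: Tuple[int, int]           # (x, y) in screenshot pixels


def step_matches(pred: PredictedStep, gold: GoldStep) -> bool:
    """True if the action type agrees and the point lies inside the target box."""
    left, top, right, bottom = gold.bbox
    x, y = pred.point
    return pred.action == gold.action and left <= x <= right and top <= y <= bottom


def trajectory_completed(preds: Sequence[PredictedStep],
                         golds: Sequence[GoldStep]) -> bool:
    """A trajectory counts as completed only if every step matches, in order."""
    return len(preds) == len(golds) and all(
        step_matches(p, g) for p, g in zip(preds, golds)
    )


def completion_rate(
    pairs: Iterable[Tuple[Sequence[PredictedStep], Sequence[GoldStep]]]
) -> float:
    """Fraction of (predicted, gold) trajectory pairs that are completed."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(trajectory_completed(p, g) for p, g in pairs) / len(pairs)
```

Strict all-steps matching is a conservative choice: it penalizes agents that reach the goal by a different but valid route, so it is probably best read as a lower bound on real task completion.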
Develop accessibility-aware interfaces
Build models that leverage accessibility trees and screenshots to improve interaction prediction for screen-reader or automation tools.
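To feed accessibility data to a model alongside a screenshot, a common approach is to flatten the tree into a short indexed list of interactive elements. The sketch below assumes each node is a dictionary with "role", "name", "bbox", and "children" keys. That layout and the set of interactive roles are assumptions for illustration; the dataset's actual accessibility format may differ.

```python
from typing import Any, Dict, FrozenSet, Iterator, List

Node = Dict[str, Any]

# Roles treated as interactive here; an assumed subset, adjust to the real data.
INTERACTIVE_ROLES: FrozenSet[str] = frozenset({"Button", "Edit", "MenuItem"})


def iter_nodes(node: Node) -> Iterator[Node]:
    """Yield the node and all of its descendants in depth-first order."""
    yield node
    for child in node.get("children", []):
        yield from iter_nodes(child)


def flatten_tree(root: Node,
                 roles: FrozenSet[str] = INTERACTIVE_ROLES) -> List[str]:
    """Render interactive elements as indexed lines a model can refer to."""
    lines: List[str] = []
    for node in iter_nodes(root):
        if node.get("role") not in roles:
            continue
        name = node.get("name", "")
        bbox = tuple(node["bbox"])
        lines.append(f'[{len(lines)}] {node["role"]} "{name}" {bbox}')
    return lines
```

The index in each line gives the model a compact way to name its target ("click element 3") instead of predicting raw pixel coordinates.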
Load GUI-360
from datasets import load_dataset
ds = load_dataset("vyokky/GUI-360")
- 1. pip install datasets
- 2. from datasets import load_dataset
- 3. dataset = load_dataset("vyokky/GUI-360")
- 4. Inspect dataset features for screenshots, actions, and traces
- 5. Split into train/eval and integrate into your training loop
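The last two steps can be sketched as follows. This is a minimal example under stated assumptions: the "train" split name and the "trajectory_id" field are hypothetical and should be replaced with whatever the dataset actually exposes. Splitting by trajectory rather than by step keeps all steps of one trajectory on the same side, which avoids leaking near-duplicate screenshots into the eval set.

```python
import hashlib
from typing import Any, Dict, Iterable, List, Tuple

Record = Dict[str, Any]


def stream_gui360(repo_id: str = "vyokky/GUI-360", split: str = "train"):
    """Open the dataset lazily instead of downloading it all up front.

    The split name is an assumption; check the dataset card for the real ones.
    """
    from datasets import load_dataset  # imported here so the helpers below work without it
    return load_dataset(repo_id, split=split, streaming=True)


def describe_record(record: Record) -> Dict[str, str]:
    """Map each field name to its Python type name for a quick schema check."""
    return {name: type(value).__name__ for name, value in record.items()}


def is_eval(trajectory_id: str, eval_fraction: float) -> bool:
    """Deterministically assign a trajectory to eval from a hash of its id."""
    digest = hashlib.sha256(trajectory_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")        # integer in [0, 2**64)
    return bucket < int(eval_fraction * 2**64)


def split_by_trajectory(
    records: Iterable[Record],
    id_field: str = "trajectory_id",   # hypothetical field name
    eval_fraction: float = 0.1,
) -> Tuple[List[Record], List[Record]]:
    """Return (train, eval) lists with every trajectory kept on one side."""
    train: List[Record] = []
    held_out: List[Record] = []
    for record in records:
        in_eval = is_eval(str(record[id_field]), eval_fraction)
        (held_out if in_eval else train).append(record)
    return train, held_out
```

Because the assignment depends only on the id hash, the same trajectory lands in the same split on every run without storing a split file. For the full dataset, the same is_eval test can be applied while iterating the stream rather than collecting records into lists.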
GUI-360: pros & cons
Pros
- Over 1.2 million executed steps across thousands of trajectories
- Full-resolution screenshots with accessibility metadata
- Multi-modal reasoning traces included
- Ready for image-text-to-text agent benchmarks
Cons
- Restricted to Microsoft Office applications only
- Large size may require substantial storage and compute
- No coverage of non-Windows or non-Office GUI environments
Frequently asked questions
A dataset of 1.2M+ GUI action steps in Word, Excel, and PowerPoint with screenshots, accessibility data, and reasoning traces for training computer-using agents.