OSWorld

Verified

Open-source benchmark for testing AI agents on real desktop OS tasks.

Autonomous Agents · General-Purpose · 2.9k · Open source
Updated 2026-06-15

What is OSWorld?

OSWorld is an open-source platform that creates controlled virtual desktop environments for benchmarking AI agents. It runs realistic operating system interactions on several virtualization backends, including VMware, VirtualBox, and Docker, as well as cloud providers such as AWS.

Agents interact with the desktop through screenshots, mouse, and keyboard actions while completing tasks such as file management, web browsing, and application use. The framework automatically resets states, records trajectories, and scores outcomes against ground-truth criteria.
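The observe, act, and score cycle described above can be sketched in a few lines of Python. This is a minimal, self-contained illustration and not OSWorld's actual API: ToyDesktopEnv, its task dictionary, and the action strings are hypothetical stand-ins, and the "screenshot" is only a placeholder string where a real environment would return pixels.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDesktopEnv:
    """Hypothetical stand-in for a VM-backed desktop environment."""
    files: set = field(default_factory=set)
    trajectory: list = field(default_factory=list)
    task: dict = field(default_factory=dict)

    def reset(self, task):
        # Restore a clean state before every task, like reverting a VM snapshot.
        self.files = set()
        self.trajectory = []
        self.task = task
        return self._observe()

    def _observe(self):
        # A real environment would return screenshot pixels here.
        return {"screenshot": f"<desktop with files {sorted(self.files)}>"}

    def step(self, action):
        # Actions are plain strings in this toy, e.g. "create report.txt".
        verb, _, target = action.partition(" ")
        if verb == "create":
            self.files.add(target)
        elif verb == "delete":
            self.files.discard(target)
        self.trajectory.append(action)  # record every action taken
        return self._observe()

    def evaluate(self):
        # Score the final state against the task's ground-truth criterion.
        return 1.0 if self.files == self.task["expected_files"] else 0.0


def run_episode(env, task, policy, max_steps=5):
    """Run one task: reset, act until done or out of steps, then score."""
    obs = env.reset(task)
    for _ in range(max_steps):
        action = policy(obs)
        if action is None:  # the agent signals that it is finished
            break
        obs = env.step(action)
    return env.evaluate(), list(env.trajectory)


def scripted_policy():
    # A fixed action sequence standing in for a multimodal model.
    actions = iter(["create draft.txt", "create report.txt", "delete draft.txt"])
    return lambda obs: next(actions, None)


task = {"instruction": "Leave only report.txt on the desktop.",
        "expected_files": {"report.txt"}}
score, trajectory = run_episode(ToyDesktopEnv(), task, scripted_policy())
print(score, trajectory)
```

The point of the sketch is the separation of concerns: the policy only sees observations, the environment owns state and the trajectory log, and scoring compares the final state with the task's expected outcome rather than with the actions taken.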

It targets AI researchers and developers building computer-use agents who need reproducible, scalable evaluations beyond simple API benchmarks.

What you can build with OSWorld

Model Evaluation

Run standardized tests to compare how well different multimodal models perform on everyday desktop workflows.

Agent Development

Iterate on agent architectures by measuring success rates across hundreds of tasks in a consistent VM setup.
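As a rough sketch of how success rates might be aggregated when comparing agent variants, the snippet below groups per-task scores by application domain. The domain labels, the sample scores, and the rule that a score of 1.0 counts as success are illustrative assumptions, not OSWorld's own reporting format.

```python
from collections import defaultdict


def success_rates(results, threshold=1.0):
    """Return per-domain and overall success rates from (domain, score) pairs."""
    per_domain = defaultdict(list)
    for domain, score in results:
        # A task counts as solved when its score reaches the threshold.
        per_domain[domain].append(score >= threshold)
    summary = {d: sum(flags) / len(flags) for d, flags in per_domain.items()}
    all_flags = [f for flags in per_domain.values() for f in flags]
    summary["overall"] = sum(all_flags) / len(all_flags) if all_flags else 0.0
    return summary


# Made-up sample results for illustration only.
results = [("chrome", 1.0), ("chrome", 0.0), ("os", 1.0), ("os", 1.0)]
print(success_rates(results))
```

Reporting the per-domain breakdown alongside the overall figure makes it easier to see whether an architecture change helped everywhere or only in one kind of task.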

Research Benchmarking

Publish results against the official OSWorld leaderboard using verified task sets and parallel evaluation.

Install OSWorld

Install
pip install desktop-env
Quick start
# Clone the OSWorld repository
git clone https://github.com/xlang-ai/OSWorld

# Change directory into the cloned repository
cd OSWorld

# Optional: Create a Conda environment for OSWorld
# conda create -n osworld python=3.10
# conda activate osworld

# Install required dependencies
pip install -r requirements.txt
  1. Clone the OSWorld GitHub repository and create a Python 3.10+ environment.
  2. Install dependencies with pip install -r requirements.txt, or install only the lighter desktop-env package with pip install desktop-env.
  3. Install and configure VMware Workstation Pro or VirtualBox. If you use VMware, verify that the vmrun command works.
  4. Run the setup script to automatically download the required virtual machine images.
  5. Launch evaluations using the provided scripts and review results in the data viewer.
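Before starting a long evaluation run, it can help to confirm the prerequisites from the steps above. The following is a minimal preflight sketch using only the Python standard library; the preflight function and its defaults are hypothetical helpers, not part of OSWorld, and the check only tests whether git and vmrun are on PATH, not whether they are configured correctly.

```python
import shutil
import sys


def preflight(min_python=(3, 10), tools=("git", "vmrun")):
    """Report whether the interpreter and command-line tools look ready."""
    report = {"python_ok": sys.version_info[:2] >= min_python}
    for tool in tools:
        # shutil.which returns the executable's path, or None if not on PATH.
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

If you use VirtualBox or Docker instead of VMware, pass a different tools tuple, for example preflight(tools=("git", "docker")).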

OSWorld: pros & cons

Pros

  • Broad platform support including local VMs and cloud instances
  • Large verified task set with automatic scoring
  • Active maintenance and community updates
  • Open-source with Apache 2.0 license

Cons

  • Initial VM setup requires technical configuration
  • Resource-heavy due to full desktop virtualization
  • Parallel runs limited on consumer hardware

Frequently asked questions

OSWorld works with VMware, VirtualBox, Docker with KVM, and cloud platforms such as AWS and Azure.
