BrowserGym
Gym-style toolkit to benchmark browser-based web agents.
What is BrowserGym?
BrowserGym is an open-source gym-style toolkit built specifically for assessing web agents that operate inside browsers. It provides the core structure needed to run controlled evaluations of how agents navigate and complete tasks on websites.
The environment follows the familiar gym interface, allowing agents to receive observations from the browser, take actions such as clicks or form inputs, and receive feedback on task outcomes. This setup supports repeatable testing without manual browser management.
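The observe-act-feedback loop described above can be sketched with a toy stand-in for the browser. This is a minimal illustration, not BrowserGym's own API: `ToyFormEnv`, `scripted_agent`, and `run_episode` are hypothetical names, and the `reset`/`step` return shapes assume the Gymnasium convention.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFormEnv:
    """Hypothetical stand-in for a browser task: fill a field, then submit."""

    max_steps: int = 5
    _filled: bool = field(default=False, init=False)
    _steps: int = field(default=0, init=False)

    def reset(self):
        # Start a fresh episode and return the first observation plus an info dict.
        self._filled = False
        self._steps = 0
        return {"goal": "Submit the form", "field_filled": False}, {}

    def step(self, action: str):
        # Apply one action and report (obs, reward, terminated, truncated, info).
        self._steps += 1
        if action == "fill":
            self._filled = True
        success = action == "submit" and self._filled
        reward = 1.0 if success else 0.0
        terminated = success
        truncated = not terminated and self._steps >= self.max_steps
        obs = {"goal": "Submit the form", "field_filled": self._filled}
        return obs, reward, terminated, truncated, {}


def scripted_agent(obs: dict) -> str:
    # Assumed policy for illustration: fill the field first, then submit.
    return "submit" if obs["field_filled"] else "fill"


def run_episode(env, agent):
    # Drive the loop until the task succeeds or the step budget runs out.
    obs, info = env.reset()
    total_reward, steps = 0.0, 0
    while True:
        action = agent(obs)
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        steps += 1
        if terminated or truncated:
            return total_reward, steps


result = run_episode(ToyFormEnv(), scripted_agent)
```

A real agent would replace `scripted_agent` with a model that reads the page observation, and the environment would come from the toolkit rather than a hand-written class.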
It targets researchers and engineers developing or comparing AI agents for web-based activities, giving them a common platform to measure progress and identify strengths in different approaches.
What you can build with BrowserGym
Agent Benchmarking
Run standardized tests to measure how well different web agents perform on browser tasks.
Performance Comparisons
Compare multiple agent designs side by side using the same evaluation environment.
Development Iteration
Test changes to an agent within a consistent gym framework to track improvements.
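As a rough sketch of a side-by-side comparison, per-episode outcomes can be collected as records and reduced to one success rate per agent. The record layout, agent names, and `success_rates` helper below are assumptions made for illustration; they are not output produced by BrowserGym.

```python
from collections import defaultdict


def success_rates(records):
    """records: iterable of (agent_name, task_id, succeeded) tuples."""
    totals = defaultdict(lambda: [0, 0])  # agent -> [successes, episodes]
    for agent, _task, succeeded in records:
        totals[agent][0] += int(succeeded)
        totals[agent][1] += 1
    return {agent: wins / n for agent, (wins, n) in totals.items()}


# Made-up example records: both agents are run on the same two tasks.
records = [
    ("agent_a", "task_1", True),
    ("agent_a", "task_2", False),
    ("agent_b", "task_1", True),
    ("agent_b", "task_2", True),
]

rates = success_rates(records)
for agent, rate in sorted(rates.items()):
    print(f"{agent}: {rate:.0%}")
```

Running every agent on the same task list keeps the comparison fair, and rerunning the same script after a change to one agent gives a simple way to track whether the change helped.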
Install BrowserGym
pip install browsergym
1. Clone the open-source BrowserGym repository
2. Install dependencies listed in the project files
3. Configure a compatible browser instance for the environment
4. Implement or load an agent that follows the gym action-observation loop
5. Execute evaluation scripts and review the resulting metrics
BrowserGym: pros & cons
Pros
- Familiar gym interface reduces the learning curve for reinforcement learning users
- Open-source code allows community inspection and contributions
- Focused specifically on browser-based web agents for targeted evaluation
- Supports consistent, repeatable benchmarking across experiments
Cons
- Requires additional browser setup, which can add complexity
- Scope is limited to web browser interactions
- May need custom task definitions for specialized use cases
Frequently asked questions
Is BrowserGym open source?
Yes, BrowserGym is provided as an open-source project.