Lumen
Vision-first browser agent with reliable replay and multi-model support.
What is Lumen?
Lumen is a vision-only agent: it loops through screenshots, model decisions, and actions until a browser-based instruction is complete. It normalizes outputs across providers, compresses history to manage context, maintains persistent state across runs, and detects and recovers from repeated actions.
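The screenshot, decision, and action loop can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Lumen's implementation: the `Action` and `Step` types and the `isRepeating`, `compressHistory`, and `runLoop` functions are hypothetical names, and the screenshot, model, and browser calls are injected as plain callbacks.

```typescript
// Hypothetical types for illustration; these are not Lumen's actual API.
type Action =
  | { kind: "click"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "done"; result: string };

interface Step {
  action: Action;
  note: string; // short description of what happened in this step
}

// True when the last `window` actions are identical, a sign the agent is stuck.
function isRepeating(history: Step[], window = 3): boolean {
  if (history.length < window) return false;
  const recent = history.slice(-window).map((s) => JSON.stringify(s.action));
  return recent.every((a) => a === recent[0]);
}

// Keep the most recent steps verbatim and collapse older ones into one summary line.
function compressHistory(history: Step[], keepRecent = 2): string[] {
  const cut = Math.max(0, history.length - keepRecent);
  const older = history.slice(0, cut);
  const recent = history.slice(cut);
  const lines: string[] = [];
  if (older.length > 0) lines.push(`[${older.length} earlier steps omitted]`);
  for (const s of recent) lines.push(s.note);
  return lines;
}

// Capture, decide, act, until the model reports "done" or the step budget runs out.
async function runLoop(
  capture: () => Promise<string>,
  decide: (screenshot: string, context: string[], stuck: boolean) => Promise<Action>,
  act: (action: Action) => Promise<void>,
  maxSteps = 20,
): Promise<string | null> {
  const history: Step[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const screenshot = await capture();
    const action = await decide(screenshot, compressHistory(history), isRepeating(history));
    if (action.kind === "done") return action.result;
    await act(action);
    history.push({ action, note: `step ${i + 1}: ${action.kind}` });
  }
  return null; // budget exhausted without a final answer
}
```

Passing the `stuck` flag to the decision callback is one possible way to let the model change strategy after repeated actions; a real agent might instead inject a corrective prompt or abort the run.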
The agent targets developers building reliable web automation workflows who need flexibility across local Chrome, CDP endpoints, or cloud browsers. It emphasizes safety through domain policies and verification gates while supporting delegation of subtasks and session resumption from saved JSON.
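A domain policy of the kind mentioned above might be checked before each navigation along these lines. This is a sketch only: the `DomainPolicy` shape and the `matchesHost` and `isNavigationAllowed` helpers are hypothetical and do not describe Lumen's actual configuration format.

```typescript
// Hypothetical policy shape for illustration; Lumen's real configuration may differ.
interface DomainPolicy {
  allow: string[]; // hostnames the agent may visit; subdomains are included
  block: string[]; // hostnames that are always refused; checked before `allow`
}

// A rule matches the exact hostname or any subdomain of it.
function matchesHost(hostname: string, rule: string): boolean {
  return hostname === rule || hostname.endsWith("." + rule);
}

function isNavigationAllowed(rawUrl: string, policy: DomainPolicy): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // unparseable URLs are refused
  }
  // Only ordinary web navigation is permitted in this sketch.
  if (url.protocol !== "https:" && url.protocol !== "http:") return false;
  const host = url.hostname.toLowerCase();
  if (policy.block.some((rule) => matchesHost(host, rule))) return false;
  return policy.allow.some((rule) => matchesHost(host, rule));
}
```

Matching on a leading dot rather than a bare suffix keeps a lookalike host such as `evilgithub.com` from passing a rule written for `github.com`.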
Capabilities
What you can build with Lumen
Web Research Automation
Navigate sites like news aggregators or e-commerce pages to extract current information such as top stories or product prices using only visual cues.
Multi-Step Form Interactions
Handle complex sequences like flight searches by starting at a preloaded URL and completing forms across multiple steps with persistent memory.
Repository Exploration
Visit developer platforms, search for specific projects, and gather details through repeated agent runs within the same session.
Install Lumen
npm install @omxyz/lumen

import { Agent } from "@omxyz/lumen";

const result = await Agent.run({
  model: "anthropic/claude-sonnet-4-6",
  browser: { type: "local" },
  instruction: "Go to news.ycombinator.com and tell me the title of the top story.",
});
console.log(result.result);

1. Install the package via npm in a Node.js 20.19+ environment with Chrome available.
2. Import the Agent class from @omxyz/lumen in your TypeScript or JavaScript file.
3. Configure the agent with your chosen model string and local browser settings.
4. Call Agent.run with an instruction string and optional startUrl or maxSteps parameters.
5. Process the returned result or use streaming for real-time event handling.
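The optional startUrl and maxSteps parameters from step 4 could be assembled and validated before the call, as in the sketch below. The option names come from the steps above, but the `RunOptions` interface, the assumption that both are top-level fields, and the `buildRunOptions` helper are hypothetical; the start URL is a placeholder.

```typescript
// Field names follow the listing; their exact placement and types in Lumen's
// API are an assumption made for this sketch.
interface RunOptions {
  model: string;
  browser: { type: "local" };
  instruction: string;
  startUrl?: string;
  maxSteps?: number;
}

// Hypothetical helper: validates inputs and fills in the fixed settings.
function buildRunOptions(instruction: string, startUrl?: string, maxSteps?: number): RunOptions {
  if (instruction.trim() === "") throw new Error("instruction must not be empty");
  if (maxSteps !== undefined && (!Number.isInteger(maxSteps) || maxSteps < 1)) {
    throw new Error("maxSteps must be a positive integer");
  }
  const options: RunOptions = {
    model: "anthropic/claude-sonnet-4-6",
    browser: { type: "local" },
    instruction,
  };
  // new URL() throws on malformed input, so a bad start URL fails early.
  if (startUrl !== undefined) options.startUrl = new URL(startUrl).toString();
  if (maxSteps !== undefined) options.maxSteps = maxSteps;
  return options;
}

const flightSearch = buildRunOptions(
  "Search for one-way flights next Friday and list the three cheapest.",
  "https://example.com/flights", // placeholder URL
  25,
);
// With the package installed, this object would be passed to Agent.run(flightSearch).
```

Validating up front means a typo in the URL or a zero step budget surfaces before a browser is launched.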
Lumen: pros & cons
Pros
- Strong benchmark performance with fast execution times on web tasks.
- Flexible model support including custom OpenAI-compatible endpoints.
- Built-in safety features like domain policies and action verification.
- Efficient history management and session resumption reduce token usage.
Cons
- Requires a local Chrome instance or paid cloud browser service for operation.
- Vision-only approach may struggle with text-heavy or dynamic interfaces.
- Limited to Node.js environments and lacks direct Python bindings.
Frequently asked questions
Lumen works with Anthropic Claude, Google Gemini, OpenAI models, and any OpenAI-compatible endpoint via custom adapters.
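The install example selects a model with a string of the form "anthropic/claude-sonnet-4-6". One plausible way to route such a string to a provider adapter is to split it at the first slash, as sketched here. The `ModelRef` type and `parseModelString` function are hypothetical, and only the "anthropic" prefix appears in this listing; other prefixes are assumptions.

```typescript
// Hypothetical representation of a parsed model string.
interface ModelRef {
  provider: string; // e.g. "anthropic"
  model: string;    // e.g. "claude-sonnet-4-6"
}

// Split at the first slash only, so model names that contain slashes stay intact.
function parseModelString(value: string): ModelRef {
  const slash = value.indexOf("/");
  if (slash <= 0 || slash === value.length - 1) {
    throw new Error(`expected "provider/model", got "${value}"`);
  }
  return { provider: value.slice(0, slash), model: value.slice(slash + 1) };
}
```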