An AI agent that operates a real browser, using visual rendering and the DOM or accessibility tree, to complete tasks on websites on the user's behalf.
Browser agents are a category of AI agents that drive a real browser to complete tasks on websites for a user. They take a goal in natural language (for example, "book me the cheapest flight to Cape Town next Tuesday"), open the relevant sites, click through forms, and carry the task through to the end much as a human user would.
To "see" a page, a browser agent typically combines a screenshot with the page's DOM and accessibility tree. The screenshot shows where elements sit in the visual layout; the DOM and accessibility tree describe what those elements are (their roles, names, and states). Together, these let the agent identify a "Book now" button or fill in the right field of a checkout form. Examples include OpenAI Operator, Anthropic Computer Use, and the open-source browser-use library.
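As a rough illustration of the semantic half of that process, the sketch below searches a simplified accessibility tree for a control by role and accessible name. The `AXNode` class, the role set, and the sample page are hypothetical stand-ins invented for this example; real agents read the tree from the browser, and none of the products named above necessarily work this way.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class AXNode:
    """Hypothetical, simplified accessibility-tree node (not a real browser API)."""
    role: str
    name: str = ""
    children: list[AXNode] = field(default_factory=list)


# Assumed subset of roles an agent might treat as actionable.
INTERACTIVE_ROLES = {"button", "link", "textbox", "checkbox", "combobox"}


def walk(node: AXNode) -> Iterator[AXNode]:
    """Yield the node and all of its descendants in document order."""
    yield node
    for child in node.children:
        yield from walk(child)


def interactive_elements(root: AXNode) -> list[tuple[str, str]]:
    """List (role, name) pairs for every actionable node in the tree."""
    return [(n.role, n.name) for n in walk(root) if n.role in INTERACTIVE_ROLES]


def find_target(root: AXNode, role: str, name: str) -> Optional[AXNode]:
    """Return the first node with the given role and a case-insensitive name match."""
    wanted = name.strip().lower()
    for node in walk(root):
        if node.role == role and node.name.strip().lower() == wanted:
            return node
    return None


# Sample page, invented for illustration.
page = AXNode("document", "Flight checkout", [
    AXNode("heading", "Your trip to Cape Town"),
    AXNode("form", "Passenger details", [
        AXNode("textbox", "Full name"),
        AXNode("textbox", "Email"),
    ]),
    AXNode("button", "Book now"),
])

target = find_target(page, "button", "book now")
```

Here `find_target(page, "button", "book now")` returns the "Book now" node, and a lookup for a control that is not in the tree returns `None`. A real agent would pair a match like this with the screenshot to decide where to click.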
Browser agents are narrower than the general concept of AI agents: they are specifically grounded in browser automation. For website owners, they are also a fast-growing source of non-human traffic. Browser agents complete tasks more easily on sites with a clean DOM, a complete accessibility tree, and good AI agent readiness (including WebMCP where appropriate), which can translate into more successful conversions from agent-mediated traffic.