How It Works

When you run npm run test:headed inside your .openqa/ directory, this is what happens:

1. Your BDD step fires

Playwright-BDD or Cucumber.js triggers a step from your .feature file, e.g.:
* I add a new todo item "Buy groceries"

2. runAgent() is called

The generated step definition in .openqa/steps/steps.ts passes the step text to runAgent().

3. An MCP server is created in-process

OpenQA wraps your Playwright browser context in a Playwright MCP server and exposes it over HTTP on a random 127.0.0.1 port. This happens entirely in-process: no separate browser, no subprocess.

4. The AI agent connects and drives the browser

The chosen provider SDK (claudeCode or openCode) connects to that MCP URL and receives your natural language instruction. It uses Playwright MCP tools (browser_navigate, browser_click, browser_type, etc.) to drive the real browser.

5. The step passes or fails

If the agent reports success, the step passes. If a verify tool fails, the agent makes zero tool calls, or the agent narrates a failure in text, the step fails immediately.
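Step 3 relies on a standard Node.js technique: listening on port 0 lets the operating system pick a free port. The sketch below shows only that mechanism, assuming Node's built-in http module. listenOnRandomPort is a hypothetical helper name, and the placeholder handler stands in for the real Playwright MCP endpoint, which this page does not describe.

```typescript
import * as http from 'node:http';
import type { AddressInfo } from 'node:net';

// Hypothetical helper (not part of OpenQA): start an HTTP server on an
// OS-assigned port that is reachable only from the local machine.
function listenOnRandomPort(): Promise<{ url: string; server: http.Server }> {
  return new Promise((resolve, reject) => {
    const server = http.createServer((_req, res) => {
      // Placeholder handler; a real MCP endpoint would speak the MCP protocol here.
      res.writeHead(200, { 'Content-Type': 'text/plain' });
      res.end('ok');
    });
    server.once('error', reject);
    // Port 0 asks the OS for any free port; 127.0.0.1 keeps the server loopback-only.
    server.listen(0, '127.0.0.1', () => {
      const { port } = server.address() as AddressInfo;
      resolve({ url: `http://127.0.0.1:${port}`, server });
    });
  });
}
```

Because every call receives its own port from the operating system, servers started at the same time do not collide, and no port number has to be written to a config file.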

Key Design Decisions

  • Browser sharing: The agent drives the exact same Playwright page object that your test holds. There is no separate browser window; cookies, session storage, and page state are shared.
  • Parallel-safe: Each runAgent() call creates its own HTTP server on a random port. Parallel test workers get separate ports with no config files or shared state.
  • Session continuity: Within a single test scenario, the agent resumes its conversation session across steps. It remembers the page it navigated to, so subsequent steps don't need to re-explain context.
  • Uniform provider interface: Both claudeCode and openCode implement the same provider.run() interface. Swapping providers is a one-line change.
  • In-process AI SDK: Both providers use their respective SDKs directly in your test process. There are no subprocesses, no temp files, no socket bridges.
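The uniform interface and session continuity can be pictured with a small sketch. Everything here is an illustrative assumption: the shapes of RunOptions and RunResult, the fakeProvider stub, and the runScenario helper are invented for the example, and only the provider.run() name comes from this page.

```typescript
// Assumed shapes, for illustration only; not the OpenQA type definitions.
interface RunOptions {
  mcpUrl: string;      // URL of the in-process MCP server
  sessionId?: string;  // set when resuming an earlier conversation
}

interface RunResult {
  sessionId: string;
  text: string;
}

interface Provider {
  name: string;
  run(instruction: string, options: RunOptions): Promise<RunResult>;
}

// Stub standing in for claudeCode / openCode so the sketch runs on its own.
function fakeProvider(name: string): Provider {
  let counter = 0;
  return {
    name,
    async run(instruction, options) {
      // Reuse the caller's session if one is supplied, otherwise start a new one.
      const sessionId = options.sessionId ?? `${name}-session-${++counter}`;
      return { sessionId, text: `[${name}] ${instruction}` };
    },
  };
}

// Runs the steps of one scenario in order, carrying the session id forward
// so later steps resume the same conversation.
async function runScenario(provider: Provider, steps: string[], mcpUrl: string): Promise<string[]> {
  let sessionId: string | undefined;
  const sessionsSeen: string[] = [];
  for (const step of steps) {
    const result = await provider.run(step, { mcpUrl, sessionId });
    sessionId = result.sessionId;
    sessionsSeen.push(result.sessionId);
  }
  return sessionsSeen;
}

// Swapping providers is confined to this one line.
const provider: Provider = fakeProvider('claudeCode'); // or fakeProvider('openCode')
```

Since both stubs satisfy the same Provider interface, runScenario never needs to know which one it was given.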

Authentication

Claude Code checks in this order:

| Method | How |
| --- | --- |
| claude login session | Best for local development; no key needed |
| ANTHROPIC_API_KEY | Set in .openqa/.env or shell environment |

OpenCode checks in this order:

| Method | How |
| --- | --- |
| opencode auth login session | Best for local development; no key needed. Supports GitLab Duo, GitHub Copilot, Anthropic, OpenAI, Google, and more. |
| Provider API key | ANTHROPIC_API_KEY, OPENAI_API_KEY, GOOGLE_API_KEY, etc. |

OpenCode provider examples:
import { openCode } from 'openqa';

// GitLab Duo (default — use your existing gitlab.com login)
openCode('gitlab/duo-chat-haiku-4-5')

// GitHub Copilot (use your existing GitHub login)
openCode('github-copilot/gpt-5.4')

// Anthropic / OpenAI / Google
openCode('anthropic/claude-haiku-4-5')
openCode('openai/gpt-4o')
openCode('google/gemini-2.0-flash')

The .openqa/.env.example file (generated by openqa init) provides a template for these environment variables.
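The lookup order described in this section (login session first, API key second) can be sketched as a small pure function. This is a hedged illustration, not the providers' actual logic: resolveAuth, AuthMethod, and the two key lists are hypothetical names, and the relative order of the OpenCode keys is arbitrary here because in practice the relevant key depends on the model you choose.

```typescript
// Hypothetical result type describing which credential source was selected.
type AuthMethod =
  | { kind: 'login-session' }
  | { kind: 'api-key'; variable: string }
  | { kind: 'none' };

// Prefers an existing login session, then the first non-empty API key variable.
function resolveAuth(
  hasLoginSession: boolean,
  env: Record<string, string | undefined>,
  keyVariables: string[],
): AuthMethod {
  if (hasLoginSession) return { kind: 'login-session' };
  for (const variable of keyVariables) {
    const value = env[variable];
    if (value !== undefined && value.trim() !== '') {
      return { kind: 'api-key', variable };
    }
  }
  return { kind: 'none' };
}

// Variable names taken from the tables above; the list order is an assumption.
const CLAUDE_CODE_KEYS = ['ANTHROPIC_API_KEY'];
const OPEN_CODE_KEYS = ['ANTHROPIC_API_KEY', 'OPENAI_API_KEY', 'GOOGLE_API_KEY'];
```

For example, resolveAuth(false, { OPENAI_API_KEY: 'sk-test' }, OPEN_CODE_KEYS) selects the OPENAI_API_KEY variable, while passing true for hasLoginSession wins regardless of which keys are set.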

Assertion Detection

Both providers actively detect assertion failures so that tests fail correctly:
  • browser_verify_* tool errors — playwright-mcp returns isError: true on verification failure. Both providers catch this and throw immediately.
  • browser_evaluate / browser_run_code_unsafe throws — evaluated code that throws an error surfaces as a tool error, caught the same way.
  • Text narration fallback — if an agent describes a failure in plain text (e.g. “the test fails because…”) without calling a verify tool, the text is matched and the step is failed.
  • Zero tool calls — if the agent responds without calling any Playwright tool at all, the step fails immediately.
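The four rules above can be summarized as one decision function. The sketch below is an assumption-laden illustration: the transcript shape, the classifyStep name, and especially the narration regex are invented for the example, since this page does not document the real matcher. It also evaluates the rules after the run, whereas the providers are described as throwing as soon as a failure is seen.

```typescript
interface ToolCall {
  name: string;      // e.g. "browser_click" or a browser_verify_* tool
  isError: boolean;  // mirrors the isError flag on a tool result
}

interface AgentTranscript {
  toolCalls: ToolCall[];
  finalText: string;
}

type StepOutcome = { passed: true } | { passed: false; reason: string };

// Illustrative pattern only; the real narration matcher is not documented here.
const FAILURE_NARRATION = /\b(test|step|assertion|verification) (fails|failed)\b/i;

// Tools whose errors count as assertion failures according to the list above.
function isAssertionTool(name: string): boolean {
  return (
    name.startsWith('browser_verify_') ||
    name === 'browser_evaluate' ||
    name === 'browser_run_code_unsafe'
  );
}

function classifyStep(transcript: AgentTranscript): StepOutcome {
  const failedCall = transcript.toolCalls.find((c) => c.isError && isAssertionTool(c.name));
  if (failedCall) {
    return { passed: false, reason: `tool error from ${failedCall.name}` };
  }
  if (transcript.toolCalls.length === 0) {
    return { passed: false, reason: 'agent made no tool calls' };
  }
  if (FAILURE_NARRATION.test(transcript.finalText)) {
    return { passed: false, reason: 'agent narrated a failure in text' };
  }
  return { passed: true };
}
```

Checking tool errors before the text fallback keeps the cheap, unambiguous signals first and leaves the fuzzier narration match as a last resort.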