mtw-e2e-runner

fastslack/mtw-e2e-runner
Community


Summary

JSON-driven E2E test runner with parallel execution against a Chrome pool.

README.md

<p align="right"> <strong>English</strong> · <a href="LEEME.md">Español</a> </p>

<h1 align="center">@matware/e2e-runner</h1>

<p align="center"> <strong>The AI-native E2E test runner that writes, runs, and debugs tests for you.</strong> </p>

<p align="center"> <a href="https://www.npmjs.com/package/@matware/e2e-runner"><img src="https://img.shields.io/npm/v/@matware/e2e-runner?color=blue" alt="npm version" /></a> <img src="https://img.shields.io/node/v/@matware/e2e-runner" alt="node version" /> <a href="https://www.npmjs.com/package/@matware/e2e-runner"><img src="https://img.shields.io/npm/dm/@matware/e2e-runner" alt="npm downloads" /></a> <a href="https://hub.docker.com/r/fastslack/e2e-runner-mcp"><img src="https://img.shields.io/docker/pulls/fastslack/e2e-runner-mcp" alt="Docker pulls" /></a> <a href="https://github.com/fastslack/mtw-e2e-runner/stargazers"><img src="https://img.shields.io/github/stars/fastslack/mtw-e2e-runner" alt="GitHub stars" /></a> <a href="LICENSE"><img src="https://img.shields.io/npm/l/@matware/e2e-runner" alt="license" /></a> <img src="https://img.shields.io/badge/MCP-compatible-green" alt="MCP compatible" /> <img src="https://img.shields.io/badge/AI--native-Claude%20Code-blueviolet" alt="AI native" /> <img src="https://img.shields.io/badge/AI--native-OpenCode-orange" alt="OpenCode compatible" /> <a href="https://skills.sh"><img src="https://img.shields.io/badge/skills.sh-e2e--testing-ff6600" alt="Agent Skills" /></a> </p>

---

E2E Runner lets you test your web app without writing test code. Tests are plain JSON, and you don't even have to write that yourself: just ask Claude Code.

🎬 Write a test by asking, then watch it run

<p align="center"> <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/demo-live.gif" alt="Live dashboard streaming screenshots as a test suite runs" width="820" /> <br/><sub><em>The live dashboard while a suite runs: every step streams a screenshot into the feed in real time.</em></sub> </p>

With the built-in MCP server, creating a test is a conversation, with no docs to read and no syntax to memorize:

You: Create an E2E test for the login flow and run it.

Claude Code: writes the test, runs it in a real browser, and reports back: ✅ login-flow passed in 2.3s · screenshot saved · no network errors.

Behind the scenes, Claude just wrote and ran the test below. A test is just JSON, an ordered list of what a user does:

[
  { "name": "login-flow", "actions": [
    { "type": "goto", "value": "/login" },
    { "type": "type", "selector": "#email", "value": "user@test.com" },
    { "type": "type", "selector": "#password", "value": "secret" },
    { "type": "click", "text": "Sign In" },
    { "type": "assert_text", "text": "Welcome back" },
    { "type": "screenshot", "value": "logged-in.png" }
  ]}
]

No imports, no describe/it, no build step. If you can read it you can write it, or just ask.

Connect it to Claude Code (2 commands):

claude plugin marketplace add fastslack/mtw-e2e-runner
claude plugin install e2e-runner@matware

Now say "create a test for X and run it". Claude gets 17 MCP tools, slash commands, and specialized agents.

Using a different agent (Cursor, Codex, Copilot, 40+ more)? Install the skill: npx skills add fastslack/mtw-e2e-runner

---

📖 Contents

| | Section | What's inside |
|---|---------|---------------|
| 🚀 | Install & first test | npm setup · run with your own Chrome (no Docker), Obscura, or a Docker pool |
| ✨ | What you get | feature overview at a glance |
| ✍️ | Writing tests | test format · full action catalog · retries · serial · modules · auth · hooks |
| 🤖 | AI integration | Claude Code · OpenCode · 17 MCP tools · visual verification · issue-to-test |
| 📊 | Dashboard & insights | live dashboard · learning system · network logs · screenshot capture |
| 🌐 | Browser drivers | browserless · cdp · lightpanda · obscura · steel |
| ⚙️ | CLI, config & CI | commands · flags · e2e.config.js · GitHub Actions · programmatic API |

---

<a name="install"></a>

🚀 Install (it's tiny)

npm install --save-dev @matware/e2e-runner
npx e2e-runner init        # scaffolds e2e/ with a sample test + config

Then pick how to run the browser. You don't need Docker unless you want the parallel pool:

Option 1 · Use the Chrome you already have (no Docker) ⭐

Launch any Chromium browser with a debugging port, then point the runner at it:

google-chrome --headless=new --remote-debugging-port=9222 &   # or brave / chromium / msedge
CHROME_POOL_URL=http://localhost:9222 POOL_DRIVER=cdp npx e2e-runner run --all

Or bake it into e2e.config.js so you never repeat it:

export default {
  baseUrl: 'http://localhost:3000',     // your app: plain localhost, no docker hostname
  poolUrls: ['http://localhost:9222'],
  poolDriver: 'cdp',
};

Nothing to install beyond npm, and baseUrl is just localhost (the browser is on your machine).

Option 2 · Obscura: one tiny binary, no Docker

A single ~30 MB binary with built-in anti-detection. Install once, run it, point the runner at it:

obscura serve --port 9222 --stealth &
CHROME_POOL_URL=http://localhost:9222 POOL_DRIVER=obscura npx e2e-runner run --all

npx e2e-runner pool start (with poolDriver: 'obscura' in your config) prints the exact install command for your OS.

Option 3 · Docker pool: parallel, for CI & big suites

A shared, queue-managed Chrome pool that runs many tests at once:

npx e2e-runner run --all     # the first run auto-starts the Docker pool for you

Requires Docker. Set baseUrl: 'http://host.docker.internal:3000' so the containerized Chrome can reach your app.

<details> <summary><strong>Why <code>host.docker.internal</code> (Docker option only)?</strong></summary>

<br/>

With the Docker pool, Chrome runs inside a container, so localhost there means the container, not your machine. host.docker.internal bridges to your host. On Linux (Docker Engine, not Docker Desktop) add --add-host=host.docker.internal:host-gateway, or use your LAN IP. Options 1 & 2 don't have this problem: the browser is local, so plain localhost just works.

</details>

Write your first test

Open e2e/tests/sample.json. A flow is an ordered list of actions:

[
  { "name": "homepage loads", "actions": [
    { "type": "goto", "value": "/" },
    { "type": "assert_text", "text": "Welcome" },
    { "type": "screenshot", "value": "home.png" }
  ]}
]

Run it with npx e2e-runner run --all. Results (pass/fail, timing, screenshots, network errors) print to your terminal and to the web dashboard if it's open.

<details> <summary><strong>Add OpenCode</strong> (optional)</summary>

<br/>

cp node_modules/@matware/e2e-runner/opencode.json ./
mkdir -p .opencode && cp -r node_modules/@matware/e2e-runner/.opencode/* .opencode/

See OPENCODE.md for details.

</details>

Updating

Each install method updates separately. Bump the one(s) you use:

# npm dependency (per project)
npm install --save-dev @matware/e2e-runner@latest

# Claude Code plugin
claude plugin update e2e-runner@matware

# MCP-only install (npx caches the package, so pin @latest to force a refresh)
claude mcp add --transport stdio --scope user e2e-runner \
  -- npx -y -p @matware/e2e-runner@latest e2e-runner-mcp

[!NOTE] Two gotchas: (1) npx prefers a copy found in the project's node_modules over its own cache. If a project pins an old version, the MCP server and dashboard run that old version, so update the project dependency too. (2) Already-running processes keep the old code in memory: after updating, restart the dashboard and reconnect the MCP server (/mcp → e2e-runner → Reconnect, or restart your session).

---

<a name="features"></a>

✨ What you get

🧪 Zero-code tests: JSON files that anyone on your team can read and write. No JavaScript, no compilation, no framework lock-in.

🤖 AI-powered testing: Claude Code creates, executes, and debugs tests natively through 17 MCP tools. Ask it to "test the checkout flow" and it builds the JSON, runs it, and reports back.

🐛 Issue-to-Test pipeline: Paste a GitHub or GitLab issue URL. The runner fetches it, generates E2E tests, runs them, and tells you: bug confirmed or not reproducible.

👁️ Visual verification: Describe what the page should look like in plain English. The AI captures a screenshot and judges pass/fail against your description. No pixel-diffing setup needed.

🧠 Learning system: Tracks test stability across runs. Detects flaky tests, unstable selectors, slow APIs, and error patterns, then surfaces actionable insights.

⚡ Parallel execution: Run N tests simultaneously against a shared browser pool (browserless, raw CDP, Lightpanda, Obscura, or Steel). Serial mode available for tests that share state.

🎯 Pluggable browser drivers: Pick the engine that fits each test: real Chrome via browserless, Lightpanda or Obscura for fast lightweight runs, Steel for managed sessions. Set driver per test or override the whole run with --driver.

📊 Real-time dashboard: Live execution view, run history with pass-rate charts, screenshot gallery with hash-based search, expandable network request logs.

πŸ” Smart retries β€” Test-level and action-level retries with configurable delays. Flaky tests are detected and flagged automatically.

📦 Reusable modules: Extract common flows (login, navigation, setup) into parameterized modules and reference them with $use.

🏗️ CI-ready: JUnit XML output, exit code 1 on failure, auto-captured error screenshots. Drop-in GitHub Actions example included.

🌐 Multi-project: One dashboard aggregates test results from all your projects. One Chrome pool serves them all.

🐳 Portable: Chrome runs in Docker, tests are JSON files in your repo. Works on any machine with Node.js and Docker.

---

<a name="writing-tests"></a>

✍️ Writing tests

Everything about authoring tests: the file format, the full action vocabulary, retries, state isolation, and reuse. Expand what you need:

<details> <summary><strong>Test format &amp; file layout</strong></summary>

<br/>

Each .json file in e2e/tests/ contains an array of tests. Each test has a name and sequential actions:

[
  {
    "name": "homepage-loads",
    "actions": [
      { "type": "goto", "value": "/" },
      { "type": "assert_visible", "selector": "body" },
      { "type": "assert_url", "value": "/" },
      { "type": "screenshot", "value": "homepage.png" }
    ]
  }
]

Suite files can have numeric prefixes for ordering (01-auth.json, 02-dashboard.json). The --suite flag matches with or without the prefix, so --suite auth finds 01-auth.json.
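
The prefix-insensitive lookup can be pictured with the sketch below. It is an illustration only, not the runner's actual code: matchesSuite is a hypothetical helper, and it assumes a prefix is a run of digits followed by a hyphen.

```javascript
// Hypothetical helper: does a suite file match the name passed to --suite?
// Assumption: a numeric prefix is one or more digits followed by "-".
function matchesSuite(fileName, requested) {
  const base = fileName.replace(/\.json$/, '');     // "01-auth.json" -> "01-auth"
  const withoutPrefix = base.replace(/^\d+-/, '');   // "01-auth" -> "auth"
  const wanted = requested.replace(/\.json$/, '');
  return wanted === base || wanted === withoutPrefix;
}

console.log(matchesSuite('01-auth.json', 'auth'));      // true
console.log(matchesSuite('01-auth.json', '01-auth'));   // true
console.log(matchesSuite('02-dashboard.json', 'auth')); // false
```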

</details>

<details> <summary><strong>Action catalog</strong>: navigation, input &amp; interaction</summary>

<br/>

| Action | Fields | Description |
|--------|--------|-------------|
| goto | value | Navigate to URL (relative to baseUrl or absolute) |
| click | selector or text | Click by CSS selector or visible text content. Text mode also takes scope: "dialog", visible: true, last: true |
| type / fill | selector, value | Clear field and type text |
| wait | selector, text, gone, or value (ms) | Wait for an element/text to appear, for the gone target (spinner/dialog) to disappear, or for a fixed delay. Prefer conditions over fixed value sleeps |
| screenshot | value (filename) | Capture a screenshot |
| select | selector, value | Select a dropdown option |
| clear | selector | Clear an input field |
| press | value | Press a keyboard key (Enter, Tab, etc.) |
| scroll | selector or value (px) | Scroll to element or by pixel amount |
| hover | selector | Hover over an element |
| evaluate | value | Execute JavaScript in the browser context |
| navigate | value | Browser navigation (back, forward, reload) |
| clear_cookies | none | Clear all cookies for the current page |
| wait_network_idle | optional value (idle ms, default 500), timeout | Wait until the network has been idle for value ms. Useful after actions that trigger background requests |
| set_storage | value ("key=val"), optional selector: "session" | Set a localStorage key (or sessionStorage with selector: "session") |
| gql | value (query), optional text (variables JSON), optional selector (assertion) | Run a GraphQL query/mutation via in-page fetch, with the auth token read from localStorage. Fails on GraphQL errors. selector is a JS expression asserted against the response r (e.g. "r.data.users.length > 0"). Installs window.__e2eGql for later evaluate steps |
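
The "key=val" form used by set_storage could be parsed along these lines. This is a sketch under assumptions, not the runner's implementation: parseStorageValue is a hypothetical name, and splitting on the first "=" only is an assumed design choice that keeps values containing "=" (such as base64 padding in tokens) intact.

```javascript
// Hypothetical parser for a storage value such as "key=val".
// Assumption: only the first "=" separates key from value.
function parseStorageValue(value) {
  const idx = value.indexOf('=');
  if (idx === -1) return { key: value, value: undefined }; // bare key, no value
  return { key: value.slice(0, idx), value: value.slice(idx + 1) };
}

console.log(parseStorageValue('accessToken=abc.def=='));
// { key: 'accessToken', value: 'abc.def==' }
```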

Click by text: when click uses text instead of selector, it searches across common interactive and content elements:

button, a, [role="button"], [role="tab"], [role="menuitem"], [role="option"],
[role="listitem"], div[class*="cursor"], span, li, td, th, label, p, h1-h6

{ "type": "click", "text": "Sign In" }

</details>

<details> <summary><strong>Assertions</strong>: verify text, elements, URLs, counts &amp; network</summary>

<br/>

| Action | Fields | Description |
|--------|--------|-------------|
| assert_text | text | Assert text exists anywhere on the page (substring) |
| assert_no_text | text | Assert text does NOT appear anywhere on the page (the opposite of assert_text) |
| assert_text_in | selector, text, optional value: "exact" | Assert text inside a scoped container. text is a case-insensitive regex by default; value: "exact" switches to case-sensitive substring |
| assert_element_text | selector, text, optional value: "exact" | Assert element's text contains (or exactly matches) the expected text |
| assert_url | value | Assert current URL path or full URL. Paths (/dashboard) compare against pathname only |
| assert_visible | selector | Assert element exists and is visible |
| assert_not_visible | selector | Assert element is hidden or doesn't exist |
| assert_attribute | selector, value | Check attribute: "type=email" for value, "disabled" for existence |
| assert_class | selector, value | Assert element has a CSS class |
| assert_input_value | selector, value | Assert input/select/textarea .value contains text |
| assert_matches | selector, value (regex) | Assert element text matches a regex pattern |
| assert_count | selector, value | Assert element count: exact ("5"), or operators (">3", ">=1", "<10") |
| assert_no_network_errors | none | Fail if any network requests failed (e.g. ERR_CONNECTION_REFUSED) |
| assert_storage | value ("key" or "key=expected"), optional selector: "session" | Assert a localStorage/sessionStorage key exists or has a specific value |
| assert_visual | value (golden image), optional selector, text (max diff, e.g. "0.02"), fullPage, maskRegions, threshold | Visual regression: compare a screenshot against a golden reference. The first run saves the golden; later runs fail if more pixels differ than the threshold (default 2%) and write a diff image |
| get_text | selector | Extract element text (non-assertion, never fails). Result: { value: "..." } |
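
To make the assert_count value syntax concrete, here is a minimal sketch of how such expressions could be evaluated. checkCount is a hypothetical function, not part of the runner, and it covers only the forms listed in the table (exact number, ">", ">=", "<").

```javascript
// Hypothetical evaluator for assert_count values: "5", ">3", ">=1", "<10".
function checkCount(actual, expected) {
  const m = /^\s*(>=|>|<)?\s*(\d+)\s*$/.exec(expected);
  if (!m) throw new Error(`Unsupported count expression: ${expected}`);
  const n = Number(m[2]);
  switch (m[1]) {
    case '>':  return actual > n;
    case '>=': return actual >= n;
    case '<':  return actual < n;
    default:   return actual === n; // no operator means an exact match
  }
}

console.log(checkCount(5, '5'));   // true
console.log(checkCount(3, '>3'));  // false
console.log(checkCount(1, '>=1')); // true
console.log(checkCount(9, '<10')); // true
```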

</details>

<details> <summary><strong>Framework-aware actions</strong>: React/MUI without <code>evaluate</code> boilerplate</summary>

<br/>

These actions handle common patterns in React/MUI apps that normally require verbose evaluate boilerplate:

| Action | Fields | Description |
|--------|--------|-------------|
| type_react | selector, value, optional blur, waitAfter | Type into React controlled inputs using the native value setter. Dispatches input + change events so React state updates correctly. blur: true commits on blur; waitAfter: "<ms>" waits after (debounced autocomplete). |
| click_regex | text (regex), optional selector, optional value: "last" | Click element whose textContent matches a regex (case-insensitive). Default: first match. Use value: "last" for last match. |
| click_option | text | Click a [role="option"] element by text, common in autocomplete/select dropdowns. |
| select_combobox | text, optional selector, filter, openWait/filterWait/waitAfter | Open a MUI Autocomplete/Select, optionally type filter, then click the option matching text. Falls back across [role="option"], .MuiAutocomplete-option, li.MuiMenuItem-root. |
| focus_autocomplete | text (label text) | Focus an autocomplete input by its label text. Supports MUI and generic [role="combobox"]. |
| click_chip | text | Click a chip/tag element by text. Searches [class*="Chip"], [class*="chip"], [data-chip]. |
| click_icon | value (icon id), optional selector (scope) | Click an icon by data-testid/data-icon/aria-label/class fragment or SVG <title> (MUI, FontAwesome, Heroicons, etc.). Clicks the nearest clickable ancestor (button, link, tab). |
| click_menu_item | text, optional selector (scope) | Click a menu item by text across [role="menuitem"], .dropdown-item, .menu-item, MUI MenuItem. |
| click_in_context | text (container text), selector (child) | Click a child element inside the smallest container matching text, e.g. the delete button of one specific card/row. |

// Before: 5 lines of evaluate boilerplate
{ "type": "evaluate", "value": "const input = document.querySelector('#search'); const nativeSet = Object.getOwnPropertyDescriptor(window.HTMLInputElement.prototype, 'value').set; nativeSet.call(input, 'term'); input.dispatchEvent(new Event('input', {bubbles: true})); input.dispatchEvent(new Event('change', {bubbles: true}));" }

// After: 1 action
{ "type": "type_react", "selector": "#search", "value": "term" }

</details>

<details> <summary><strong>Multi-tab actions</strong>: popups, OAuth windows &amp; cross-tab flows</summary>

<br/>

| Action | Fields | Description |
|--------|--------|-------------|
| open_tab | value (URL), optional text (label) | Open a new tab and navigate to the URL (relative to baseUrl or absolute). Label defaults to tab-<n> |
| switch_tab | value | Switch the active tab by label, numeric index, or title/URL match (regex or substring). "default" returns to the original tab |
| wait_for_tab | optional text (label), timeout | Wait for a new tab/popup opened by the app (window.open, target="_blank") and make it the active tab |
| assert_tab_count | value | Assert the number of open tabs: exact ("2") or operators (">=2") |
| close_tab | optional value (label) | Close the current (or named) tab and switch back to the last remaining one |

All subsequent actions run in the active tab:

{ "type": "click", "text": "Open report" }
{ "type": "wait_for_tab", "text": "report" }
{ "type": "assert_text", "text": "Quarterly results" }
{ "type": "close_tab" }

</details>

<details> <summary><strong>Retries &amp; flaky detection</strong></summary>

<br/>

Test-level retry: retry an entire test on failure. Set globally via config or per-test:

{ "name": "flaky-test", "retries": 3, "timeout": 15000, "actions": [...] }

Tests that pass after retry are flagged as flaky in the report and learning system.
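
The flaky flag can be sketched as follows. This is a simplified, synchronous illustration rather than the runner's code: runWithRetries is a hypothetical function, and it assumes "retries": 3 means up to three extra attempts after the first one.

```javascript
// Hypothetical sketch: run a test up to (1 + retries) times and flag it as
// flaky if it only passes after at least one failed attempt.
function runWithRetries(runTest, retries) {
  let attempts = 0;
  let lastError = null;
  while (attempts <= retries) {
    attempts += 1;
    try {
      runTest(attempts);
      return { passed: true, attempts, flaky: attempts > 1 };
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  return { passed: false, attempts, flaky: false, error: lastError.message };
}

// A test that fails twice and then passes.
const retryResult = runWithRetries((attempt) => {
  if (attempt < 3) throw new Error('element not found');
}, 3);
console.log(retryResult); // { passed: true, attempts: 3, flaky: true }
```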

Action-level retry: retry a single action without rerunning the entire test. Useful for timing-sensitive clicks and waits:

{ "type": "click", "selector": "#dynamic-btn", "retries": 3 }
{ "type": "wait", "selector": ".lazy-loaded", "retries": 2 }

Set globally: actionRetries in config, --action-retries <n> CLI, or ACTION_RETRIES env var. Delay between retries: actionRetryDelay (default 500ms).

</details>

<details> <summary><strong>Serial tests</strong>: for tests that share state</summary>

<br/>

Tests that share state (e.g., two tests modifying the same record) can race when running in parallel. Mark them as serial:

{ "name": "create-patient", "serial": true, "actions": [...] }
{ "name": "verify-patient-list", "serial": true, "actions": [...] }

Serial tests run one at a time after all parallel tests finish, preventing interference without slowing down independent tests.
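
Conceptually, the scheduler splits a suite into two groups. The sketch below shows only that split; planExecution is a hypothetical name, and the real runner's scheduling may differ in detail.

```javascript
// Hypothetical sketch: split tests into a parallel batch and a serial queue.
// The parallel batch would run first; the serial queue then runs one by one.
function planExecution(tests) {
  const parallel = tests.filter((t) => !t.serial);
  const serial = tests.filter((t) => t.serial === true);
  return { parallel, serial };
}

const plan = planExecution([
  { name: 'homepage-loads' },
  { name: 'create-patient', serial: true },
  { name: 'search-works' },
  { name: 'verify-patient-list', serial: true },
]);
console.log(plan.parallel.map((t) => t.name)); // [ 'homepage-loads', 'search-works' ]
console.log(plan.serial.map((t) => t.name));   // [ 'create-patient', 'verify-patient-list' ]
```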

</details>

<details> <summary><strong>Testing authenticated apps</strong></summary>

<br/>

The simplest approach is to log in via the UI like a real user:

{
  "hooks": {
    "beforeEach": [
      { "type": "goto", "value": "/login" },
      { "type": "type", "selector": "#email", "value": "test@example.com" },
      { "type": "type", "selector": "#password", "value": "test-password" },
      { "type": "click", "text": "Sign In" },
      { "type": "wait", "selector": ".dashboard" }
    ]
  },
  "tests": [...]
}

For SPAs with JWT, skip the login form by injecting the token directly:

{ "type": "set_storage", "value": "accessToken=eyJhbGciOiJIUzI1NiIs..." }

Or set it globally in config:

// e2e.config.js
export default {
  authToken: 'eyJhbGciOiJIUzI1NiIs...',
  authStorageKey: 'accessToken',
};

Each test runs in a fresh browser context, so auth state is automatically clean between tests.

More strategies: Cookie-based auth, HTTP header injection, OAuth/SSO bypasses, reusable auth modules, and role-based testing. See docs/authentication.md

</details>

<details> <summary><strong>Reusable modules</strong>: extract common flows with <code>$use</code></summary>

<br/>

Extract common flows into parameterized modules:

// e2e/modules/login.json
{
  "$module": "login",
  "description": "Log in via the UI login form",
  "params": {
    "email": { "required": true, "description": "User email" },
    "password": { "required": true, "description": "User password" }
  },
  "actions": [
    { "type": "goto", "value": "/login" },
    { "type": "type", "selector": "#email", "value": "{{email}}" },
    { "type": "type", "selector": "#password", "value": "{{password}}" },
    { "type": "click", "text": "Sign In" },
    { "type": "wait", "value": "2000" }
  ]
}

Use in tests:

{
  "name": "dashboard-loads",
  "actions": [
    { "$use": "login", "params": { "email": "user@test.com", "password": "secret" } },
    { "type": "assert_text", "text": "Dashboard" }
  ]
}

Modules support parameter validation (required params fail fast), conditional blocks ({{#param}}...{{/param}}), nested composition, and cycle detection.
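
A rough sketch of what expanding a $use reference involves is shown below. expandModule is a hypothetical function written for illustration; it covers only required-param validation and {{name}} substitution, and leaves out conditional blocks, nested composition, and cycle detection.

```javascript
// Hypothetical sketch of module expansion: validate required params, then
// substitute {{name}} placeholders in every string field of each action.
function expandModule(moduleDef, params) {
  for (const [name, spec] of Object.entries(moduleDef.params || {})) {
    if (spec.required && !(name in params)) {
      throw new Error(`Module "${moduleDef.$module}" is missing required param "${name}"`);
    }
  }
  const fill = (text) =>
    text.replace(/\{\{(\w+)\}\}/g, (match, key) => (key in params ? String(params[key]) : match));
  return moduleDef.actions.map((action) =>
    Object.fromEntries(
      Object.entries(action).map(([k, v]) => [k, typeof v === 'string' ? fill(v) : v])
    )
  );
}

const loginModule = {
  $module: 'login',
  params: { email: { required: true }, password: { required: true } },
  actions: [
    { type: 'goto', value: '/login' },
    { type: 'type', selector: '#email', value: '{{email}}' },
    { type: 'type', selector: '#password', value: '{{password}}' },
  ],
};

const expanded = expandModule(loginModule, { email: 'user@test.com', password: 'secret' });
console.log(expanded[1]); // { type: 'type', selector: '#email', value: 'user@test.com' }
```

Calling expandModule(loginModule, { email: 'user@test.com' }) throws immediately because password is required, which is the fail-fast behavior described above.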

</details>

<details> <summary><strong>Hooks</strong>: beforeAll / beforeEach / afterEach / afterAll</summary>

<br/>

Run actions at lifecycle points. Define globally in config or per-suite:

{
  "hooks": {
    "beforeAll": [{ "type": "goto", "value": "/setup" }],
    "beforeEach": [{ "type": "goto", "value": "/" }],
    "afterEach": [{ "type": "screenshot", "value": "after.png" }],
    "afterAll": []
  },
  "tests": [...]
}

Important: beforeAll runs on a separate browser page that is closed before tests start. Use beforeEach for state that tests need (cookies, localStorage, auth tokens).
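
The ordering of hooks around each test can be visualized with a small sketch. lifecycle is a hypothetical helper that only lists the steps in order for one suite; it does not model parallel workers or the runner's actual internals.

```javascript
// Hypothetical sketch: list the order in which hook and test actions run.
function lifecycle(suite) {
  const hooks = suite.hooks || {};
  const has = (name) => (hooks[name] || []).length > 0;
  const steps = [];
  if (has('beforeAll')) steps.push('beforeAll (separate page, closed before tests)');
  for (const test of suite.tests) {
    if (has('beforeEach')) steps.push(`beforeEach -> ${test.name}`);
    steps.push(`test -> ${test.name}`);
    if (has('afterEach')) steps.push(`afterEach -> ${test.name}`);
  }
  if (has('afterAll')) steps.push('afterAll');
  return steps;
}

const steps = lifecycle({
  hooks: {
    beforeAll: [{ type: 'goto', value: '/setup' }],
    beforeEach: [{ type: 'goto', value: '/' }],
    afterAll: [],
  },
  tests: [{ name: 'a' }, { name: 'b' }],
});
console.log(steps.join('\n'));
```

Because beforeAll runs on its own page, anything it stores in cookies or localStorage is not visible to the steps listed after it.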

</details>

<details> <summary><strong>Exclude patterns</strong>: skip drafts from <code>--all</code></summary>

<br/>

Skip exploratory or draft tests from --all runs:

// e2e.config.js
export default {
  exclude: ['explore-*', 'debug-*', 'draft-*'],
};

Individual suite runs (--suite) are not affected by exclude patterns.
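
Pattern matching of this kind can be sketched as below. This is an assumption-laden illustration: globToRegExp and filterSuites are hypothetical names, "*" is treated as the only wildcard, and patterns are assumed to match the whole suite name.

```javascript
// Hypothetical sketch: turn a pattern like "draft-*" into a RegExp.
function globToRegExp(pattern) {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&'); // escape regex metacharacters
  return new RegExp(`^${escaped.replace(/\*/g, '.*')}$`);
}

// Keep only the suites that match none of the exclude patterns.
function filterSuites(names, exclude) {
  const matchers = exclude.map(globToRegExp);
  return names.filter((name) => !matchers.some((re) => re.test(name)));
}

console.log(
  filterSuites(['auth', 'draft-checkout', 'debug-tmp', 'dashboard'], ['explore-*', 'debug-*', 'draft-*'])
);
// [ 'auth', 'dashboard' ]
```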

</details>

---

<a name="ai"></a>

🤖 AI integration

The whole point: your agent writes, runs, and verifies tests for you.

<details> <summary><strong>Claude Code</strong>: plugin install &amp; MCP-only install</summary>

<br/>

claude plugin marketplace add fastslack/mtw-e2e-runner
claude plugin install e2e-runner@matware

This gives Claude 17 MCP tools, a workflow skill, 4 slash commands (/e2e-runner:run, /e2e-runner:create-test, /e2e-runner:verify-issue, /e2e-runner:capture), and 3 specialized agents (test-analyzer, test-creator, test-improver).

MCP-only install (tools only, no skill/commands/agents):

claude mcp add --transport stdio --scope user e2e-runner \
  -- npx -y -p @matware/e2e-runner e2e-runner-mcp

</details>

<details> <summary><strong>OpenCode</strong></summary>

<br/>

cp node_modules/@matware/e2e-runner/opencode.json ./
mkdir -p .opencode && cp -r node_modules/@matware/e2e-runner/.opencode/* .opencode/

See OPENCODE.md for details.

</details>

<details> <summary><strong>The 17 MCP tools</strong></summary>

<br/>

| Tool | Description |
|------|-------------|
| e2e_run | Run tests (all, by suite, or by file) |
| e2e_list | List available test suites |
| e2e_create_test | Create a new test JSON file |
| e2e_create_module | Create a reusable module |
| e2e_pool_status | Check Chrome pool health |
| e2e_app_pool_status | Inspect the app environment pool (forks, ports, drivers) |
| e2e_screenshot | Retrieve a screenshot by hash |
| e2e_capture | Capture screenshot of any URL |
| e2e_analyze | Extract page structure (interactive elements, forms, headings) and emit test scaffolds |
| e2e_dashboard_start | Start web dashboard |
| e2e_dashboard_stop | Stop web dashboard |
| e2e_dashboard_restart | Restart the dashboard (new project dir/port, clear stale sessions) |
| e2e_issue | Fetch issue and generate tests |
| e2e_network_logs | Query network logs for a run |
| e2e_learnings | Query stability insights |
| e2e_vars | Manage SQLite-backed {{var.KEY}} project variables |
| e2e_neo4j | Manage Neo4j knowledge graph |

Pool start/stop are CLI-only and are not exposed via MCP.

</details>

<details> <summary><strong>Visual verification</strong>: describe the page, AI judges it</summary>

<br/>

Describe what the page should look like, and the AI judges pass/fail from screenshots:

{
  "name": "dashboard-loads",
  "expect": "Patient list with at least 3 rows, no error messages, sidebar with navigation links",
  "actions": [
    { "type": "goto", "value": "/dashboard" },
    { "type": "wait", "selector": ".patient-list" }
  ]
}

After test actions complete, the runner auto-captures a verification screenshot. The MCP response includes the screenshot hash; Claude Code retrieves it and visually verifies against your expect description. No API key required.

</details>

<details> <summary><strong>Issue-to-test</strong>: turn a bug report into a runnable test</summary>

<br/>

Turn GitHub and GitLab issues into executable E2E tests. Paste an issue URL and get runnable tests automatically.

How it works:

  1. Fetch: Pulls issue details (title, body, labels) via gh or glab CLI
  2. Generate: AI creates JSON test actions based on the issue description
  3. Run: Optionally executes the tests immediately to verify if a bug is reproducible

# Fetch and display
e2e-runner issue https://github.com/owner/repo/issues/42

# Generate a test file via Claude API
e2e-runner issue https://github.com/owner/repo/issues/42 --generate

# Generate + run + report
e2e-runner issue https://github.com/owner/repo/issues/42 --verify
# -> "BUG CONFIRMED" or "NOT REPRODUCIBLE"

In Claude Code, just ask: "Fetch issue #42 and create E2E tests for it"

Bug verification logic: Generated tests assert the correct behavior. Test failure = bug confirmed. All tests pass = not reproducible.
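
The verdict rule is small enough to state as code. The function below is a hypothetical restatement of the rule above, not the runner's source; the shape of the results array is an assumption.

```javascript
// Hypothetical sketch of the verdict rule: generated tests assert the correct
// behavior, so any failing test means the reported bug was reproduced.
function verdict(results) {
  return results.some((r) => !r.passed) ? 'BUG CONFIRMED' : 'NOT REPRODUCIBLE';
}

console.log(verdict([{ name: 'issue-42-a', passed: true }, { name: 'issue-42-b', passed: false }]));
// BUG CONFIRMED
console.log(verdict([{ name: 'issue-42-a', passed: true }]));
// NOT REPRODUCIBLE
```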

Auth: GitHub requires gh CLI, GitLab requires glab CLI. Self-hosted GitLab is supported.

</details>

---

<a name="dashboard"></a>

📊 Dashboard & insights

e2e-runner dashboard                  # Start on default port 8484
e2e-runner dashboard --port 9090      # Custom port

<details> <summary><strong>Web dashboard tour</strong>: live view, history, gallery, pool status</summary>

<br/>

Live execution: monitor tests in real time with step-by-step progress, durations, and active worker count.

<p align="center"> <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-live-running.png" alt="Dashboard - Live test execution" width="800" /> </p>

Test suites: browse all suites across projects. Run a single suite or all tests with one click.

<p align="center"> <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-suites.png" alt="Dashboard - Test suites grid" width="800" /> </p>

Run history: track pass-rate trends with the built-in chart. Click any row to expand full detail.

<p align="center"> <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-runs.png" alt="Dashboard - Run history" width="800" /> </p>

Run detail: PASS/FAIL badges, screenshot thumbnails with copyable hashes (ss:77c28b5a), formatted console errors, and network request logs.

<p align="center"> <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-run-detail.png" alt="Dashboard - Run detail" width="800" /> </p>

Screenshot gallery: browse all captured screenshots with hash search (action, error, and verification captures).

<p align="center"> <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-screenshots-gallery.png" alt="Dashboard - Screenshot gallery" width="800" /> </p>

Pool status: Chrome pool health, including available slots, running sessions, and memory pressure.

<p align="center"> <img src="https://raw.githubusercontent.com/fastslack/mtw-e2e-runner/main/docs/screenshots/blog-dashboard-pool-status.png" alt="Dashboard - Pool status" width="800" /> </p>

</details>

<details> <summary><strong>Learning system</strong>: flaky tests, unstable selectors, slow APIs</summary>

<br/>

The runner learns from every test run, building knowledge about your test suite over time. Query insights via the e2e_learnings MCP tool:

| Query | Returns |
|-------|---------|
| summary | Full health overview: pass rate, flaky tests, unstable selectors, API issues |
| flaky | Tests that pass only after retries |
| selectors | CSS selectors with high failure rates |
| pages | Pages with console errors, network failures, load time issues |
| apis | API endpoints with error rates and latency (auto-normalized: UUIDs, hashes, IDs) |
| errors | Most frequent error patterns, categorized |
| trends | Pass rate over time (auto-switches to hourly when all data is from one day) |
| test:<name> | Drill-down history for a specific test |
| page:<path> | Drill-down history for a specific page |
| selector:<value> | Drill-down history for a specific selector |
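
The auto-normalization mentioned for apis groups requests that differ only in variable path segments. A minimal sketch of the idea follows; normalizeApiPath, the placeholder names (:uuid, :id, :hash), and the 16-character hash threshold are all assumptions made for illustration.

```javascript
// Hypothetical sketch: collapse variable path segments so that requests to
// the same endpoint are grouped together.
function normalizeApiPath(path) {
  return path
    .split('/')
    .map((seg) => {
      if (/^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i.test(seg)) return ':uuid';
      if (/^\d+$/.test(seg)) return ':id';
      if (/^[0-9a-f]{16,}$/i.test(seg)) return ':hash';
      return seg;
    })
    .join('/');
}

console.log(normalizeApiPath('/api/users/42/orders/3f2b8c1e-9a4d-4e6f-8b2a-1c3d5e7f9a0b'));
// /api/users/:id/orders/:uuid
```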

Storage & export:

  • SQLite (~/.e2e-runner/dashboard.db): default, zero setup
  • Neo4j knowledge graph: optional, for relationship-based analysis. Manage via e2e_neo4j MCP tool or docker compose
  • Markdown report (e2e/learnings.md): auto-generated after each run

Test narration: Each test run generates a human-readable narrative of what happened step by step, visible in the CLI output and the dashboard.

</details>

<details> <summary><strong>Network error handling</strong>: assertions, global flag, full logging</summary>

<br/>

Explicit assertion: place assert_no_network_errors after critical page loads:

{ "type": "goto", "value": "/dashboard" },
{ "type": "wait", "selector": ".loaded" },
{ "type": "assert_no_network_errors" }

Global flag: set failOnNetworkError: true to automatically fail any test with network errors:

e2e-runner run --all --fail-on-network-error

When disabled (default), the runner still collects and reports network errors. The MCP response includes a warning when tests pass but have network errors.

Full network logging: all XHR/fetch requests are captured with URL, method, status, duration, request/response headers, and response body (truncated at 50KB). Viewable in the dashboard with expandable request detail rows.

MCP drill-down flow:

1. e2e_run                                          → compact networkSummary + runDbId
2. e2e_network_logs(runDbId)                        → all requests (url, method, status, duration)
3. e2e_network_logs(runDbId, errorsOnly: true)      → only failed requests
4. e2e_network_logs(runDbId, includeHeaders: true)  → with headers
5. e2e_network_logs(runDbId, includeBodies: true)   → full request/response bodies

The e2e_run response stays compact (~5KB) regardless of how many requests were captured. Use e2e_network_logs with the returned runDbId to drill into details on demand.

</details>

<details> <summary><strong>Screenshot capture</strong>: snapshot any URL on demand</summary>

<br/>

Capture screenshots of any URL on demand, with no test suite required:

e2e-runner capture https://example.com
e2e-runner capture https://example.com --full-page --selector ".loaded" --delay 2000

Via MCP, the e2e_capture tool supports authToken and authStorageKey for authenticated pages; it injects the token into localStorage before navigating.

Every screenshot gets a deterministic hash (for example, ss:a3f2b1c9). Use e2e_screenshot to retrieve any screenshot by hash; it returns the image with metadata (test name, step, type).
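
"Deterministic" here means the same input always produces the same identifier. The runner's actual hashing scheme is not documented in this README, so the sketch below swaps in 32-bit FNV-1a purely as an illustration; the function name `screenshotId` and the choice to hash the image bytes are assumptions.

```javascript
// Illustrative only: derive an "ss:" + 8 hex digit id from image bytes using
// 32-bit FNV-1a. The real runner may hash different inputs with a different algorithm.
function screenshotId(bytes) {
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (const byte of bytes) {
    hash ^= byte;
    // Multiply by the FNV prime and keep the result as an unsigned 32-bit value.
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return 'ss:' + hash.toString(16).padStart(8, '0');
}

const pixels = Uint8Array.from([137, 80, 78, 71]); // stand-in for real image data
console.log(screenshotId(pixels) === screenshotId(pixels)); // prints true
```

Because the id depends only on the input bytes, it can be recomputed or looked up later without storing any per-run counter.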

</details>

---

<a name="drivers"></a>

🌐 Browser drivers

The runner can talk to multiple browser engines through different drivers. The default is auto: it probes each pool URL and picks the right driver per pool.

| Driver | Engine | Detection probe | When to use |
|--------|--------|-----------------|-------------|
| browserless | Real Chromium via browserless | /pressure returns JSON | Default. Production-grade JS execution, screencast, full Chrome behavior |
| cdp | Generic CDP-compatible (raw Chrome, etc.) | /json/version reachable | Fallback for any CDP server that isn't one of the others |
| lightpanda | Lightpanda (Zig) | /json/version Browser=lightpanda | ~9× faster, ~16× less memory than headless Chrome; ideal for high-volume scrape-style tests |
| obscura | Obscura (Rust + V8) | /json/version Browser=obscura | ~30 MB RAM footprint, built-in anti-detection (--stealth), stays close to real Chrome via Puppeteer |
| steel | Steel Browser | /v1/sessions returns JSON | Managed session lifecycle, REST API for orchestration |
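
The detection probes in the table can be read as a small decision procedure. The sketch below classifies a pool from already-collected probe results; the probe object shape (`pressureIsJson`, `sessionsIsJson`, `versionReachable`, `versionBrowser`), the function name, and the order in which probes are checked are assumptions made for illustration, not the runner's documented behavior.

```javascript
// Hypothetical classifier over probe results gathered from one pool URL.
// Field names and check order are assumptions; only the probe meanings come from the table.
function classifyDriver(probe) {
  if (probe.pressureIsJson) return 'browserless'; // /pressure returns JSON
  if (probe.sessionsIsJson) return 'steel';       // /v1/sessions returns JSON

  // /json/version exposes a Browser field on CDP-compatible servers.
  const browser = (probe.versionBrowser || '').toLowerCase();
  if (browser.includes('lightpanda')) return 'lightpanda';
  if (browser.includes('obscura')) return 'obscura';

  // Any other reachable CDP endpoint falls back to the generic driver.
  if (probe.versionReachable) return 'cdp';
  return null; // nothing answered: the pool is unreachable or unknown
}

console.log(classifyDriver({ versionReachable: true, versionBrowser: 'Chrome/126.0.0.0' }));
// prints "cdp"
```

Checking the specific engines before the generic cdp case matters, because lightpanda and obscura also answer /json/version.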

<details> <summary><strong>Pick a driver per test / force one per run</strong></summary>

<br/>

{
  "tests": [
    {
      "name": "checkout flow (heavy JS, real Chrome)",
      "driver": "browserless",
      "actions": [...]
    },
    {
      "name": "scrape product page (lightweight)",
      "driver": "obscura",
      "fallbackDriver": "cdp",
      "actions": [...]
    }
  ]
}

driver is optional. If set, only pools whose detected driver matches become candidates. fallbackDriver is an explicit opt-in: without it, a missing target driver fails the test with a clear message. Pool busyness does not trigger fallback; the runner waits inside the filtered set.
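
The selection rules above can be sketched as a pure function. This is an illustration under stated assumptions rather than the runner's code: the function name `candidatePools`, the pool shape (`{ url, driver, busy }`), and the error wording are hypothetical.

```javascript
// Hypothetical sketch of per-test pool filtering.
// pools: [{ url, driver, busy }], test: { driver?, fallbackDriver? }
function candidatePools(pools, test) {
  if (!test.driver) return pools; // no preference: every pool is a candidate

  // Busy pools stay in the set; busyness alone never triggers fallback.
  const primary = pools.filter((pool) => pool.driver === test.driver);
  if (primary.length > 0) return primary;

  // Fallback is used only when explicitly requested and the target driver is absent.
  if (test.fallbackDriver) {
    const fallback = pools.filter((pool) => pool.driver === test.fallbackDriver);
    if (fallback.length > 0) return fallback;
  }

  throw new Error(`No pool with driver "${test.driver}" is available`);
}

const pools = [
  { url: 'ws://localhost:3333', driver: 'browserless', busy: true },
  { url: 'http://localhost:9222', driver: 'cdp', busy: false },
];
console.log(candidatePools(pools, { driver: 'obscura', fallbackDriver: 'cdp' }).map((p) => p.url));
// prints [ 'http://localhost:9222' ]
```

Note that a test asking for browserless still gets only the busy browserless pool back; the caller is expected to wait for it rather than switch engines.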

Force a driver for a whole run (CLI overrides win over per-test fields, which is useful for A/B benchmarks):

e2e-runner run --all --driver obscura
e2e-runner run --all --driver obscura --fallback-driver cdp

</details>

<details> <summary><strong>Running each driver locally</strong></summary>

<br/>

# browserless (default): managed by `pool start`
e2e-runner pool start

# Lightpanda: pool start uses templates/docker-compose-lightpanda.yml
e2e-runner pool start                 # with poolDriver: 'lightpanda' in config

# Obscura: install the binary and run it yourself
curl -LO https://github.com/h4ckf0r0day/obscura/releases/latest/download/obscura-x86_64-linux.tar.gz
tar xzf obscura-x86_64-linux.tar.gz
./obscura serve --port 9222 --stealth
# then point the runner at it: poolUrls: ['http://localhost:9222'], poolDriver: 'obscura'

</details>

---

<a name="reference"></a>

⚙️ CLI, config & CI

<details> <summary><strong>CLI commands</strong></summary>

<br/>

# Run tests
e2e-runner run --all                  # All suites
e2e-runner run --suite auth           # Single suite
e2e-runner run --tests path/to.json   # Specific file
e2e-runner run --inline '<json>'      # Inline JSON

# Pool management (CLI only, not MCP)
e2e-runner pool start                 # Start Chrome container
e2e-runner pool stop                  # Stop Chrome container
e2e-runner pool status                # Check pool health

# Issue-to-test
e2e-runner issue <url>                # Fetch issue
e2e-runner issue <url> --generate     # Generate test via AI
e2e-runner issue <url> --verify       # Generate + run + report

# Dashboard
e2e-runner dashboard                  # Start web dashboard

# Other
e2e-runner list                       # List available suites
e2e-runner capture <url>              # On-demand screenshot
e2e-runner init                       # Scaffold project

</details>

<details> <summary><strong>CLI options</strong></summary>

<br/>

| Flag | Default | Description |
|------|---------|-------------|
| --base-url <url> | http://host.docker.internal:3000 | Application base URL |
| --pool-url <ws> | ws://localhost:3333 | Chrome pool WebSocket URL |
| --concurrency <n> | 3 | Parallel test workers |
| --retries <n> | 0 | Retry failed tests N times |
| --action-retries <n> | 0 | Retry failed actions N times |
| --test-timeout <ms> | 60000 | Per-test timeout |
| --timeout <ms> | 10000 | Default action timeout |
| --output <format> | json | Report: json, junit, both |
| --env <name> | default | Environment profile |
| --fail-on-network-error | false | Fail tests with network errors |
| --project-name <name> | dir name | Project display name |
| --driver <name> | _(per-test)_ | Force pool driver for the run: browserless, cdp, lightpanda, obscura, steel |
| --fallback-driver <name> | _none_ | Explicit fallback if no pool with --driver is reachable |

</details>

<details> <summary><strong>Configuration</strong>: <code>e2e.config.js</code> &amp; priority</summary>

<br/>

Create e2e.config.js in your project root:

export default {
  baseUrl: 'http://host.docker.internal:3000',
  concurrency: 4,
  retries: 2,
  actionRetries: 1,
  testTimeout: 30000,
  outputFormat: 'both',
  failOnNetworkError: true,
  exclude: ['explore-*', 'debug-*'],

  hooks: {
    beforeEach: [{ type: 'goto', value: '/' }],
  },

  environments: {
    staging: { baseUrl: 'https://staging.example.com' },
    production: { baseUrl: 'https://example.com', concurrency: 5 },
  },
};

Config priority (highest wins):

  1. CLI flags
  2. Environment variables
  3. Config file (e2e.config.js or e2e.config.json)
  4. Defaults

When --env <name> is set, the matching profile overrides everything.
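
The priority list and the profile rule can be sketched as a layered merge. This is a minimal illustration, assuming shallow merging and a hypothetical `resolveConfig` function with hypothetical argument names; it reads "overrides everything" literally, and the real runner may merge nested keys or treat CLI flags differently.

```javascript
// Hypothetical resolver: later spreads win, so the order encodes the priority list.
// Shallow merge only; nested objects such as hooks are replaced, not combined.
function resolveConfig({ defaults = {}, file = {}, env = {}, cli = {}, envName } = {}) {
  const merged = { ...defaults, ...file, ...env, ...cli };

  // When an environment profile is selected, apply it on top of everything else.
  const profile = envName ? (file.environments || {})[envName] : undefined;
  return profile ? { ...merged, ...profile } : merged;
}

const file = {
  concurrency: 4,
  environments: { production: { baseUrl: 'https://example.com', concurrency: 5 } },
};
console.log(resolveConfig({ defaults: { concurrency: 3 }, file, cli: { concurrency: 8 } }).concurrency);
// prints 8
console.log(resolveConfig({ defaults: { concurrency: 3 }, file, cli: { concurrency: 8 }, envName: 'production' }).concurrency);
// prints 5
```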

</details>

<details> <summary><strong>CI/CD</strong>: JUnit XML &amp; GitHub Actions</summary>

<br/>

Generate a JUnit XML report:

e2e-runner run --all --output junit

GitHub Actions job (excerpt from a workflow file):

jobs:
  e2e:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npx e2e-runner pool start
      - run: npx e2e-runner run --all --output junit
      - uses: mikepenz/action-junit-report@v4
        if: always()
        with:
          report_paths: e2e/screenshots/junit.xml

</details>

<details> <summary><strong>Programmatic API</strong></summary>

<br/>

import { createRunner } from '@matware/e2e-runner';

const runner = await createRunner({ baseUrl: 'http://localhost:3000' });

// Each call returns a report. Distinct names avoid redeclaring the same const.
const allReport = await runner.runAll();
const suiteReport = await runner.runSuite('auth');
const fileReport = await runner.runFile('e2e/tests/login.json');
const inlineReport = await runner.runTests([
  { name: 'quick-check', actions: [{ type: 'goto', value: '/' }] },
]);

</details>

---

Requirements

  • Node.js >= 20
  • Docker: only for Option 3 (the parallel Chrome pool). Options 1 & 2 don't need it.

License

Copyright 2026 Matias Aguirre (fastslack), Matware

Licensed under the Apache License, Version 2.0. See LICENSE for details.
