screengrab-tool-mcp

jaimemahaffey/screengrab-tool-mcp

Summary

Lets an LLM see what's on your screen by capturing an entire monitor or a specific application window.

A small Model Context Protocol server that lets an LLM see what's on your screen — either an entire monitor or a specific application window. The screen-capture tool is cross-platform; the window-level tools are Windows-only and use pywin32 under the hood.

Built with the official mcp Python SDK, mss for fast cross-platform capture, and Pillow for image processing. Packaged with uv so it runs with one command via uvx.

---

Tools

capture_screen (cross-platform)

Capture an entire monitor and return a downsized PNG.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| delay_seconds | integer | 0 | Seconds to wait before capturing. Range 0–60. |
| display_index | integer | 1 | 0 = all monitors stitched, 1 = primary, 2 = secondary, etc. Out-of-range values fall back to primary. |
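
The fallback rule for display_index can be sketched as a small pure function. This is an illustration only: resolve_display_index and monitor_count are hypothetical names, and the sketch assumes the monitor list follows the mss layout (slot 0 is the stitched view, slots 1..N are individual displays).

```python
def resolve_display_index(display_index: int, monitor_count: int) -> int:
    """Map a requested display_index onto a valid slot.

    monitor_count is the number of physical displays, so the valid
    indices are 0 (all monitors stitched) through monitor_count.
    Anything outside that range falls back to 1, the primary display.
    """
    if 0 <= display_index <= monitor_count:
        return display_index
    return 1  # out-of-range request: fall back to primary
```

With two displays attached, a request for index 5 would resolve to 1 rather than raising an error.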

list_windows (Windows-only)

Enumerate visible top-level windows on the desktop, sorted most-recently-focused first. Returns a JSON array of {hwnd, title, process, pid} so the model can pick the one it wants. No parameters.

capture_window (Windows-only)

Capture a specific window matched by query string.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | string | required | Title substring (case-insensitive) tried first; then process name (e.g. "code" matches Code.exe). Most recently focused wins on multi-match. |
| remember_as | string | | Optional friendly name to store the resolved window under for the rest of this server process. Recall later via capture_remembered. |
| delay_seconds | integer | 0 | Seconds to wait before capturing. Range 0–60. |
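
The two-pass matching described for query might look roughly like the sketch below. WindowInfo and resolve_query are hypothetical names, the fields mirror the {hwnd, title, process, pid} shape returned by list_windows, and treating the process match as a case-insensitive substring test is an assumption rather than something the project documents.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WindowInfo:
    hwnd: int
    title: str
    process: str
    pid: int


def resolve_query(query: str, windows: List[WindowInfo]) -> Optional[WindowInfo]:
    """Pick a window for query; windows must be sorted most-recently-focused first."""
    needle = query.casefold()
    # Pass 1: case-insensitive title substring. The first hit is also the
    # most recently focused one because of the input ordering.
    for window in windows:
        if needle in window.title.casefold():
            return window
    # Pass 2: process name, so "code" can match "Code.exe".
    for window in windows:
        if needle in window.process.casefold():
            return window
    return None
```

Because the list is already ordered by focus recency, "most recently focused wins" falls out of taking the first match in each pass.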

The capture pipeline tries PrintWindow first (no focus disturb). If that returns a near-blank frame — common with hardware-accelerated apps like browsers and Electron — it falls back to briefly bringing the window forward, grabbing its rect with mss, then restoring the previous foreground window.
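
The try-then-fall-back flow can be sketched without any Win32 calls by passing the two capture strategies in as callables. Everything here is an assumption for illustration: the README does not say how a near-blank frame is detected, so the dominant-byte heuristic, the 0.99 threshold, and the names is_near_blank and capture_with_fallback are all hypothetical.

```python
from collections import Counter
from typing import Callable


def is_near_blank(pixels: bytes, threshold: float = 0.99) -> bool:
    """Return True when a single byte value makes up almost the whole buffer."""
    if not pixels:
        return True
    _, count = Counter(pixels).most_common(1)[0]
    return count / len(pixels) >= threshold


def capture_with_fallback(
    print_window: Callable[[], bytes],
    foreground_grab: Callable[[], bytes],
) -> bytes:
    """Try the non-intrusive capture first; fall back only if it looks blank."""
    frame = print_window()
    if is_near_blank(frame):
        # Stand-in for: bring window forward, grab its rect, restore focus.
        frame = foreground_grab()
    return frame
```

Keeping the fallback behind a check means well-behaved windows are captured without ever stealing focus.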

capture_remembered (Windows-only)

Capture a previously remembered window by friendly name.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | required | The name passed to capture_window's remember_as. last is always valid after the first successful window capture. |
| delay_seconds | integer | 0 | Seconds to wait before capturing. Range 0–60. |

Aliases live in process memory only — they vanish when the server restarts.
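
A process-local alias store with the always-available last entry could be as small as the sketch below. AliasStore and its methods are hypothetical names, and storing the window handle (hwnd) as the remembered value is an assumption about what "the resolved window" means.

```python
from typing import Dict, Optional


class AliasStore:
    """In-memory name -> hwnd map; contents are lost when the process exits."""

    def __init__(self) -> None:
        self._aliases: Dict[str, int] = {}

    def remember(self, hwnd: int, name: Optional[str] = None) -> None:
        # Every successful window capture refreshes the implicit "last" alias.
        self._aliases["last"] = hwnd
        if name:
            self._aliases[name] = hwnd

    def resolve(self, name: str) -> int:
        try:
            return self._aliases[name]
        except KeyError:
            raise KeyError(f"no remembered window named {name!r}") from None
```

Nothing is written to disk, which matches the behaviour described above: restart the server and the dictionary starts empty.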

save_last_capture

Persists a recent capture from the in-memory ring buffer to a file in the workspace. Captures from capture_screen, capture_window, and capture_remembered are pushed into a 5-slot ring as a side effect; this tool writes one of them to disk.

| Parameter | Type | Default | Description |
|---|---|---|---|
| index | int | 0 | Ring slot. 0 = most recent. Max 4. |
| name | str | (auto) | Filename (no path separators). .png appended if missing. |
| dir | str | (env or ./screenshots/) | Destination directory. |
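
One way to picture the 5-slot ring is a bounded deque that pushes on the left, so slot 0 is always the newest capture. This is a sketch, not the project's code; CaptureRing and its methods are hypothetical names.

```python
from collections import deque


class CaptureRing:
    """Fixed-size buffer of recent PNG payloads; index 0 is the most recent."""

    def __init__(self, slots: int = 5) -> None:
        self._ring = deque(maxlen=slots)

    def push(self, png: bytes) -> None:
        # appendleft on a full bounded deque silently drops the oldest entry.
        self._ring.appendleft(png)

    def get(self, index: int = 0) -> bytes:
        if not 0 <= index < len(self._ring):
            raise IndexError(f"ring slot {index} is empty or out of range")
        return self._ring[index]
```

After six pushes, the first capture is gone and index 4 holds the second one.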

Directory resolution: dir arg → SCREENGRAB_SAVE_DIR env var → ./screenshots/. Auto-created if missing. Filename collisions auto-suffix (foo.png → foo-1.png).
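
The resolution order and collision suffixing can be sketched with pathlib. resolve_save_path is a hypothetical helper; it covers an explicit name only and leaves out the auto-generated filename, whose format the README does not specify.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_save_path(name: str, dir_arg: Optional[str] = None) -> Path:
    """Return a free .png path, honouring dir arg, then env var, then default."""
    if "/" in name or "\\" in name:
        raise ValueError("name must not contain path separators")
    if not name.lower().endswith(".png"):
        name += ".png"
    # dir arg -> SCREENGRAB_SAVE_DIR -> ./screenshots/
    base = Path(dir_arg or os.environ.get("SCREENGRAB_SAVE_DIR") or "./screenshots")
    base.mkdir(parents=True, exist_ok=True)  # auto-create if missing
    candidate = base / name
    stem, suffix = candidate.stem, candidate.suffix
    counter = 1
    while candidate.exists():
        # foo.png -> foo-1.png -> foo-2.png ...
        candidate = base / f"{stem}-{counter}{suffix}"
        counter += 1
    return candidate
```

The function only picks a path; the caller still has to write the PNG bytes to it.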

---

Returned images are downscaled with Pillow.Image.thumbnail((2000, 2000)) before being base64-encoded as a PNG, so you don't blow up the context window with a raw 4K screen grab.
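
To see what the 2000-pixel bound means for a given screen, the sizing arithmetic can be worked through in plain Python. thumbnail_size is a hypothetical helper that approximates the fit-inside, never-upscale behaviour of thumbnail; Pillow's own rounding may differ by a pixel in edge cases.

```python
from typing import Tuple


def thumbnail_size(width: int, height: int, max_side: int = 2000) -> Tuple[int, int]:
    """Fit (width, height) inside a max_side square, keeping aspect ratio."""
    if width <= max_side and height <= max_side:
        return width, height  # already small enough; never upscale
    if width >= height:
        return max_side, max(1, round(height * max_side / width))
    return max(1, round(width * max_side / height)), max_side
```

A 3840x2160 capture comes out as 2000x1125, while a 1920x1080 capture is left untouched.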

---

Install / run

You don't need to install anything globally. With uv present on your PATH, the server runs straight from a local checkout via uvx:

# from inside the project directory
uvx --from . screengrab-tool-mcp

Or, once published to PyPI:

uvx screengrab-tool-mcp

The server speaks MCP over stdio. All logs go to stderr — never stdout — so it's safe to pipe directly into an MCP client.
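
A minimal sketch of keeping diagnostics off stdout, assuming the standard logging module; the logger name screengrab and the debug_note helper are illustrative, not taken from the project.

```python
import logging
import sys

# Send all log records to stderr so stdout carries only protocol traffic.
logging.basicConfig(
    stream=sys.stderr,
    level=logging.INFO,
    format="%(levelname)s %(name)s: %(message)s",
)
log = logging.getLogger("screengrab")


def debug_note(message: str) -> None:
    """Ad-hoc debugging output; an explicit file= keeps it off stdout."""
    print(message, file=sys.stderr)
```

A bare print(...) without file=sys.stderr would land on stdout and interleave with the JSON-RPC stream.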

---

Adding it to your MCP client

Claude Code (CLI)

claude mcp add screengrab -- uvx --from C:/Users/mahaffey/screengrab-tool-mcp screengrab-tool-mcp

Or, if installed from PyPI:

claude mcp add screengrab -- uvx screengrab-tool-mcp

To remove it later:

claude mcp remove screengrab

Claude Desktop

Edit your claude_desktop_config.json (Settings → Developer → Edit Config):

{
  "mcpServers": {
    "screengrab": {
      "command": "uvx",
      "args": ["--from", "C:/Users/mahaffey/screengrab-tool-mcp", "screengrab-tool-mcp"]
    }
  }
}

Once published to PyPI, simplify to:

{
  "mcpServers": {
    "screengrab": {
      "command": "uvx",
      "args": ["screengrab-tool-mcp"]
    }
  }
}

VS Code / GitHub Copilot

Add to your .vscode/mcp.json (workspace) or user-level mcp.json:

{
  "servers": {
    "screengrab": {
      "type": "stdio",
      "command": "uvx",
      "args": ["--from", "C:/Users/mahaffey/screengrab-tool-mcp", "screengrab-tool-mcp"]
    }
  }
}

Cursor

Add to ~/.cursor/mcp.json:

{
  "mcpServers": {
    "screengrab": {
      "command": "uvx",
      "args": ["screengrab-tool-mcp"]
    }
  }
}

---

Development

# install dependencies into a local venv
uv sync

# run directly
uv run screengrab-tool-mcp

# or run the module
uv run python -m screengrab_tool_mcp.server

Project layout

screengrab-tool-mcp/
├── pyproject.toml          # uv / hatchling packaging + script entry point
├── README.md
├── src/
│   └── screengrab_tool_mcp/
│       ├── __init__.py
│       ├── server.py       # MCP wiring, tool dispatch, stdio entry
│       ├── capture.py      # capture_screen and capture_window pipelines
│       ├── windows.py      # Win32 enumeration and query resolution
│       └── aliases.py      # in-memory window-alias store
└── tests/
    ├── test_aliases.py
    ├── test_windows.py
    └── test_capture.py

---

Notes & gotchas

  • stdout is sacred. The MCP stdio transport multiplexes JSON-RPC over stdout. The server configures logging.basicConfig(stream=sys.stderr, ...) for exactly this reason. If you add print(...) calls, route them to sys.stderr or you will crash the connection.

  • Permissions on macOS. macOS requires the parent process (Claude Desktop, VS Code, Terminal, etc.) to have Screen Recording permission in System Settings → Privacy & Security. Grant it once and restart the host.

  • Wayland on Linux. mss uses X11. On a pure-Wayland session, capture will be limited to the XWayland surface; XWayland or an X11 session works best.

  • Display indexing. mss uses monitors[0] for the virtual all-monitors rectangle and monitors[1..N] for individual displays. Index 1 is the primary display.

  • Window tools are Windows-only. list_windows, capture_window, and capture_remembered use pywin32 and assume a real interactive desktop session. They will return a clear error on non-Windows hosts. capture_screen remains cross-platform.
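
The "clear error on non-Windows hosts" behaviour could be implemented with a small guard called at the top of each window tool. This is a sketch under assumptions: require_windows and UnsupportedPlatformError are hypothetical names, and the platform parameter exists only to make the check easy to exercise.

```python
import sys


class UnsupportedPlatformError(RuntimeError):
    """Raised when a Windows-only tool is invoked on another OS."""


def require_windows(tool_name: str, platform: str = sys.platform) -> None:
    # sys.platform is "win32" on Windows, including 64-bit builds.
    if platform != "win32":
        raise UnsupportedPlatformError(
            f"{tool_name} is only available on Windows "
            f"(current platform: {platform})"
        )
```

Checking the platform before importing pywin32 also avoids an ImportError on hosts where the package is not installed.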

License

MIT
