screengrab-tool-mcp
A small Model Context Protocol server that lets an LLM see what's on your screen — either an entire monitor or a specific application window. The screen-capture tool is cross-platform; the window-level tools are Windows-only and use pywin32 under the hood.
Built with the official mcp Python SDK, mss for fast cross-platform capture, and Pillow for image processing. Packaged with uv so it runs with one command via uvx.
---
Tools
capture_screen (cross-platform)
Capture an entire monitor and return a downsized PNG.
| Parameter       | Type    | Default | Description |
| --------------- | ------- | ------- | ----------- |
| delay_seconds   | integer | 0       | Seconds to wait before capturing. Range 0–60. |
| display_index   | integer | 1       | 0 = all monitors stitched, 1 = primary, 2 = secondary, etc. Out-of-range values fall back to primary. |
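The fallback rule for display_index can be sketched as follows. This is a minimal illustration, not the server's actual code: it assumes a monitors list shaped like the one mss exposes (index 0 is the virtual all-monitors rectangle, 1..N are individual displays), and `resolve_monitor` is a hypothetical helper name.

```python
from typing import Sequence


def resolve_monitor(monitors: Sequence[dict], display_index: int) -> dict:
    """Pick a monitor entry, falling back to the primary (index 1) when out of range."""
    if 0 <= display_index < len(monitors):
        return monitors[display_index]
    return monitors[1]


# Stand-in for mss.mss().monitors on a two-display setup (geometry is made up).
monitors = [
    {"left": 0, "top": 0, "width": 3840, "height": 1080},     # 0: all monitors
    {"left": 0, "top": 0, "width": 1920, "height": 1080},     # 1: primary
    {"left": 1920, "top": 0, "width": 1920, "height": 1080},  # 2: secondary
]

chosen = resolve_monitor(monitors, 7)  # out of range, so the primary is used
```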
list_windows (Windows-only)
Enumerate visible top-level windows on the desktop, sorted most-recently-focused first. Returns a JSON array of {hwnd, title, process, pid} so the model can pick the one it wants. No parameters.
capture_window (Windows-only)
Capture a specific window matched by query string.
| Parameter       | Type    | Default  | Description |
| --------------- | ------- | -------- | ----------- |
| query           | string  | required | Title substring (case-insensitive) tried first; then process name (e.g. "code" matches Code.exe). Most recently focused wins on multi-match. |
| remember_as     | string  |          | Optional friendly name to store the resolved window under for the rest of this server process. Recall later via capture_remembered. |
| delay_seconds   | integer | 0        | Seconds to wait before capturing. Range 0–60. |
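The matching order described for query might look roughly like this. It is a sketch under assumptions: `WindowInfo` and `resolve_query` are hypothetical names, the input list is assumed to be sorted most-recently-focused first (as list_windows returns it), and process names are assumed to be matched by case-insensitive substring.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class WindowInfo:
    hwnd: int
    title: str
    process: str  # executable name, e.g. "Code.exe"
    pid: int


def resolve_query(windows: Sequence[WindowInfo], query: str) -> Optional[WindowInfo]:
    """Return the best match; `windows` must be most-recently-focused first."""
    q = query.lower()
    # Pass 1: case-insensitive title substring. First hit is the most recent.
    for w in windows:
        if q in w.title.lower():
            return w
    # Pass 2: process name (substring matching is an assumption of this sketch).
    for w in windows:
        if q in w.process.lower():
            return w
    return None


# Made-up sample data, most recently focused first.
windows = [
    WindowInfo(0x2B4, "Code review - Mozilla Firefox", "firefox.exe", 5332),
    WindowInfo(0x1A2, "README.md - Visual Studio Code", "Code.exe", 4120),
]
```

With this data, "code" resolves to the Firefox window because a title match on the most recently focused window wins, while "code.exe" only matches in the process-name pass.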
The capture pipeline tries PrintWindow first, which does not disturb focus. If that returns a near-blank frame (common with hardware-accelerated apps such as browsers and Electron), it falls back to briefly bringing the window to the foreground, grabbing its rect with mss, then restoring the previous foreground window.
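The try-then-fall-back flow can be sketched in a platform-neutral way by injecting the two capture strategies as callables. Everything here is an assumption for illustration: the helper names are hypothetical, frames are modeled as raw 8-bit grayscale bytes, and the "near-blank" test (byte values barely vary) is a stand-in heuristic, since the source does not say how blankness is detected.

```python
from typing import Callable, Tuple


def is_near_blank(pixels: bytes, tolerance: int = 8) -> bool:
    """Hypothetical heuristic: a frame is near-blank if its values barely vary."""
    if not pixels:
        return True
    return max(pixels) - min(pixels) <= tolerance


def capture_with_fallback(
    try_print_window: Callable[[], bytes],
    grab_foreground: Callable[[], bytes],
) -> Tuple[bytes, str]:
    """Prefer the non-intrusive path; use the foreground grab only if needed."""
    frame = try_print_window()
    if not is_near_blank(frame):
        return frame, "printwindow"
    # Fallback: in the real pipeline this step would raise the window,
    # grab its rect, and restore the previous foreground window.
    return grab_foreground(), "foreground"


black = bytes(64)            # what a failed PrintWindow might return
textured = bytes(range(64))  # a frame with real content
```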
capture_remembered (Windows-only)
Capture a previously remembered window by friendly name.
| Parameter       | Type    | Default  | Description |
| --------------- | ------- | -------- | ----------- |
| name            | string  | required | The name passed to capture_window's remember_as. last is always valid after the first successful window capture. |
| delay_seconds   | integer | 0        | Seconds to wait before capturing. Range 0–60. |
Aliases live in process memory only — they vanish when the server restarts.
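A process-local alias store of this kind can be as small as a dictionary. The sketch below is illustrative only: `AliasStore` and its methods are hypothetical names, and storing the raw window handle as the value is an assumption.

```python
from typing import Dict, Optional


class AliasStore:
    """In-memory name -> hwnd map; contents are lost when the process exits."""

    def __init__(self) -> None:
        self._aliases: Dict[str, int] = {}

    def record_capture(self, hwnd: int, remember_as: Optional[str] = None) -> None:
        # Every successful window capture refreshes the built-in "last" alias.
        self._aliases["last"] = hwnd
        if remember_as:
            self._aliases[remember_as] = hwnd

    def lookup(self, name: str) -> int:
        try:
            return self._aliases[name]
        except KeyError:
            raise KeyError(f"no window remembered as {name!r}") from None


store = AliasStore()
store.record_capture(100, remember_as="editor")
store.record_capture(200)  # no alias given; only "last" is updated
```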
save_last_capture
Persists a recent capture from the in-memory ring buffer to a file in the workspace. Captures from capture_screen, capture_window, and capture_remembered are pushed into a 5-slot ring as a side effect; this tool writes one of them to disk.
| Parameter | Type    | Default                 | Description |
| --------- | ------- | ----------------------- | ----------- |
| index     | integer | 0                       | Ring slot. 0 = most recent. Max 4. |
| name      | string  | (auto)                  | Filename (no path separators). .png appended if missing. |
| dir       | string  | (env or ./screenshots/) | Destination directory. |
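The 5-slot ring with slot 0 as the most recent capture maps naturally onto a bounded deque. This is a sketch, not the server's implementation; `CaptureRing` is a hypothetical name and storing PNG bytes is an assumption.

```python
from collections import deque


class CaptureRing:
    """Fixed-size buffer of recent captures; index 0 is the newest."""

    def __init__(self, size: int = 5) -> None:
        self._slots = deque(maxlen=size)

    def push(self, png_bytes: bytes) -> None:
        # appendleft keeps the newest at index 0; once the deque is full,
        # the oldest entry is dropped from the right end automatically.
        self._slots.appendleft(png_bytes)

    def get(self, index: int = 0) -> bytes:
        if not 0 <= index < len(self._slots):
            raise IndexError(f"ring slot {index} is empty or out of range")
        return self._slots[index]


ring = CaptureRing()
for i in range(7):          # push 7 captures into 5 slots
    ring.push(bytes([i]))   # one-byte stand-ins for PNG data
```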
Directory resolution: dir arg → SCREENGRAB_SAVE_DIR env var → ./screenshots/. Auto-created if missing. Filename collisions auto-suffix (foo.png → foo-1.png).
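The resolution order and collision suffixing described above could be implemented along these lines. The helper names are hypothetical, and continuing the counter past -1 (foo-2.png, foo-3.png, ...) is an assumption; the source only documents the first suffix.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_save_dir(dir_arg: Optional[str] = None) -> Path:
    """dir argument, then SCREENGRAB_SAVE_DIR, then ./screenshots/."""
    chosen = dir_arg or os.environ.get("SCREENGRAB_SAVE_DIR") or "screenshots"
    path = Path(chosen)
    path.mkdir(parents=True, exist_ok=True)  # auto-create if missing
    return path


def unique_path(directory: Path, name: str) -> Path:
    """Append .png if missing, then suffix -1, -2, ... until the name is free."""
    if not name.lower().endswith(".png"):
        name += ".png"
    candidate = directory / name
    stem, suffix = candidate.stem, candidate.suffix
    counter = 1
    while candidate.exists():
        candidate = directory / f"{stem}-{counter}{suffix}"
        counter += 1
    return candidate
```

A caller would combine the two, for example `unique_path(resolve_save_dir(), "foo")`, and write the chosen ring slot's bytes to the returned path.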
---
Returned images are downscaled with Pillow's Image.thumbnail((2000, 2000)), which preserves aspect ratio and never enlarges, then encoded as PNG and base64-encoded. This keeps a raw 4K screen grab from blowing up the context window.
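To make the size limit concrete, the arithmetic behind an aspect-preserving fit into a 2000×2000 box is sketched below using only the standard library. It approximates what a thumbnail call does (Pillow's exact rounding may differ by a pixel), and `fit_within` and `to_base64` are hypothetical helper names.

```python
import base64
from typing import Tuple


def fit_within(width: int, height: int, max_side: int = 2000) -> Tuple[int, int]:
    """Scale (width, height) down to fit a max_side box, keeping aspect ratio."""
    scale = min(max_side / width, max_side / height, 1.0)  # 1.0 cap: never upscale
    return max(1, round(width * scale)), max(1, round(height * scale))


def to_base64(png_bytes: bytes) -> str:
    """Encode already-compressed PNG bytes as ASCII text for the response."""
    return base64.b64encode(png_bytes).decode("ascii")


size_4k = fit_within(3840, 2160)    # a 4K frame shrinks to fit the box
size_small = fit_within(1280, 720)  # already small enough, so unchanged
```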
---
Install / run
You don't need to install anything globally. With uv present on your PATH, the server runs straight from a local checkout via uvx:
# from inside the project directory
uvx --from . screengrab-tool-mcp
Or, once published to PyPI:
uvx screengrab-tool-mcp
The server speaks MCP over stdio. All logs go to stderr — never stdout — so it's safe to pipe directly into an MCP client.
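Keeping stdout clean comes down to pointing the logging system at stderr before anything else runs. A minimal sketch follows; the `configure_logging` name, the format string, the level, and the use of `force=True` are assumptions of this example rather than the server's actual settings.

```python
import logging
import sys


def configure_logging(stream=None, level=logging.INFO) -> None:
    """Send all log records to stderr (or a given stream), never stdout."""
    logging.basicConfig(
        stream=stream if stream is not None else sys.stderr,
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,  # replace handlers installed earlier (Python 3.8+)
    )


configure_logging()
logging.getLogger("screengrab").info("server starting")  # written to stderr
```

The optional stream parameter exists only so the behavior can be checked in tests by passing an in-memory buffer.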
---
Adding it to your MCP client
Claude Code (CLI)
claude mcp add screengrab -- uvx --from C:/Users/mahaffey/screengrab-tool-mcp screengrab-tool-mcp
Or, if installed from PyPI:
claude mcp add screengrab -- uvx screengrab-tool-mcp
To remove it later:
claude mcp remove screengrab
Claude Desktop
Edit your claude_desktop_config.json (Settings → Developer → Edit Config):
{
"mcpServers": {
"screengrab": {
"command": "uvx",
"args": ["--from", "C:/Users/mahaffey/screengrab-tool-mcp", "screengrab-tool-mcp"]
}
}
}
Once published to PyPI, simplify to:
{
"mcpServers": {
"screengrab": {
"command": "uvx",
"args": ["screengrab-tool-mcp"]
}
}
}
VS Code / GitHub Copilot
Add to your .vscode/mcp.json (workspace) or user-level mcp.json:
{
"servers": {
"screengrab": {
"type": "stdio",
"command": "uvx",
"args": ["--from", "C:/Users/mahaffey/screengrab-tool-mcp", "screengrab-tool-mcp"]
}
}
}
Cursor
Add to ~/.cursor/mcp.json:
{
"mcpServers": {
"screengrab": {
"command": "uvx",
"args": ["screengrab-tool-mcp"]
}
}
}
---
Development
# install dependencies into a local venv
uv sync
# run directly
uv run screengrab-tool-mcp
# or run the module
uv run python -m screengrab_tool_mcp.server
Project layout
screengrab-tool-mcp/
├── pyproject.toml # uv / hatchling packaging + script entry point
├── README.md
├── src/
│   └── screengrab_tool_mcp/
│       ├── __init__.py
│       ├── server.py # MCP wiring, tool dispatch, stdio entry
│       ├── capture.py # capture_screen and capture_window pipelines
│       ├── windows.py # Win32 enumeration and query resolution
│       └── aliases.py # in-memory window-alias store
└── tests/
    ├── test_aliases.py
    ├── test_windows.py
    └── test_capture.py
---
Notes & gotchas
- stdout is sacred. The MCP stdio transport carries JSON-RPC messages over stdout. The server configures logging.basicConfig(stream=sys.stderr, ...) for exactly this reason. If you add print(...) calls, route them to sys.stderr or you will corrupt the protocol stream and break the connection.
- Permissions on macOS. macOS requires the parent process (Claude
Desktop, VS Code, Terminal, etc.) to have Screen Recording permission in System Settings → Privacy & Security. Grant it once and restart the host.
- Wayland on Linux. mss uses X11. On a pure-Wayland session, capture is limited to the XWayland surface; running under XWayland or a full X11 session works best.
- Display indexing. mss uses monitors[0] for the virtual all-monitors rectangle and monitors[1..N] for individual displays. Index 1 is the primary display.
- Window tools are Windows-only. list_windows, capture_window, and capture_remembered use pywin32 and assume a real interactive desktop session. They will return a clear error on non-Windows hosts. capture_screen remains cross-platform.
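The platform restriction in the last note could be enforced with a small guard before tool dispatch. This is a sketch under assumptions: `check_platform` is a hypothetical helper, and the error type and message are illustrative rather than what the server actually returns.

```python
import sys

WINDOWS_ONLY_TOOLS = {"list_windows", "capture_window", "capture_remembered"}


def check_platform(tool_name: str, platform: str = sys.platform) -> None:
    """Raise a descriptive error when a Windows-only tool runs elsewhere."""
    if tool_name in WINDOWS_ONLY_TOOLS and platform != "win32":
        raise RuntimeError(
            f"{tool_name} is only available on Windows "
            f"(current platform: {platform}); use capture_screen instead"
        )


check_platform("capture_screen", platform="linux")  # cross-platform, passes
```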
License
MIT






