Enables text-only agents to process images by accepting image files, base64 data, or URLs, sending them to multimodal models, and returning structured text results via MCP.
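As a rough illustration of the "image files, base64 data, or URLs" input step, the sketch below sorts an incoming string into one of those three kinds before it would be handed to a multimodal model. The function name `normalize_image_input` and the returned dictionary shape are assumptions made for this example; they are not taken from VisionToolMCP, whose actual interface is documented at the source.

```python
import base64
from pathlib import Path
from urllib.parse import urlparse


def normalize_image_input(value: str) -> dict:
    """Classify an image reference as a URL, a local file, or raw base64.

    Hypothetical helper for illustration only; not part of VisionToolMCP.
    """
    # 1. Remote image: an http(s) URL with a host is passed through unchanged.
    parsed = urlparse(value)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return {"kind": "url", "url": value}

    # 2. Local image: read the file and base64-encode its bytes.
    #    Very long or malformed strings can make the path check raise,
    #    so treat any such error as "not a file".
    try:
        is_file = Path(value).is_file()
    except (OSError, ValueError):
        is_file = False
    if is_file:
        encoded = base64.b64encode(Path(value).read_bytes()).decode("ascii")
        return {"kind": "file", "base64": encoded}

    # 3. Inline image: accept the string only if it is strictly valid base64.
    try:
        base64.b64decode(value, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        raise ValueError("expected a URL, an existing file, or base64 data") from None
    return {"kind": "base64", "base64": value}
```

Checking the URL and file cases before base64 keeps short path-like strings from being mistaken for encoded data; a real server would likely also validate the image type and size.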
Getting started
Add VisionToolMCP to your MCP-capable client (Claude Code, Cursor, Codex, and others) by following the setup instructions at the source, which document the exact command, configuration, and any required API keys.