Remote OpenClaw Blog

Claude Image Generation: How to Actually Get It in 2026

6 min read · 20 October 2018

Claude cannot generate images: Anthropic's own documentation states that "Claude is an image understanding model only. It can interpret and analyze images, but it cannot generate, produce, edit, manipulate, or create images." The practical workaround in 2026 is to connect an image generation MCP server, so Claude Code or OpenClaw writes the prompt and a model like gpt-image-1, Flux, or a local ComfyUI pipeline renders the picture.

Can Claude Generate Images?

No, Claude does not generate images natively in claude.ai, the API, Claude Code, or OpenClaw. The official vision documentation is explicit in its FAQ: asked "Can Claude generate or edit images?", Anthropic answers "No, Claude is an image understanding model only."

Claude can produce visual output in indirect ways that sometimes get confused with image generation: it writes SVG markup, HTML and CSS mockups, Mermaid diagrams, and matplotlib scripts that render to PNG. Those are code paths, not a diffusion model. When people say they generate images "with Claude," they almost always mean Claude orchestrating an external image model through a tool call, which is exactly what the rest of this guide sets up.
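To make the "code path" distinction concrete, here is a minimal sketch of the kind of SVG Claude can write directly. The function name make_badge_svg, the colors, and the dimensions are all hypothetical choices for illustration; the point is only that the output is plain, editable text rather than rendered pixels.

```shell
# Print a small SVG badge to stdout. Everything here is ordinary text:
# no image model is involved, and any value can be edited by hand.
# make_badge_svg is a hypothetical helper name, not part of any tool.
make_badge_svg() {
  label=$1
  cat <<EOF
<svg xmlns="http://www.w3.org/2000/svg" width="120" height="40" viewBox="0 0 120 40">
  <rect width="120" height="40" rx="8" fill="#1f6feb"/>
  <text x="60" y="25" text-anchor="middle" font-family="sans-serif" font-size="14" fill="#ffffff">$label</text>
</svg>
EOF
}

# Write the badge to a file that any browser can open.
make_badge_svg "Build OK" > badge.svg
```

Changing the label or fill color is a one-line text edit, which is why this route suits icons and simple diagrams better than a diffusion model.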

What Claude Can Do With Images

Claude's image support is input-only, and within that lane it is strong: it analyzes screenshots, charts, forms, UI designs, and photos you send it. Per the vision docs, Claude accepts JPEG, PNG, GIF, and WebP, with a maximum of 10 MB per image on the Claude API, dimensions up to 8000x8000 px, and up to 100 images per API request on models with a 200k-token context window.
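As a rough pre-flight check before sending a file to Claude, the format and size limits quoted above can be sketched as a small shell function. The name check_image_input is hypothetical, "10 MB" is read here as 10 x 1024 x 1024 bytes (an assumption), and the 8000x8000 px dimension limit is not checked because that would need an image tool.

```shell
# Limit taken from the figures quoted in this article; override if your
# account or model documents a different value.
MAX_BYTES=${MAX_BYTES:-10485760}   # assumed reading of "10 MB"

# Hypothetical helper: succeeds if the file has an accepted extension
# and is no larger than MAX_BYTES; prints a reason and fails otherwise.
check_image_input() {
  file=$1
  # Lowercase the extension so .PNG and .png are treated the same.
  ext=$(printf '%s' "${file##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    jpg|jpeg|png|gif|webp) ;;
    *) echo "unsupported format: $ext"; return 1 ;;
  esac
  [ -f "$file" ] || { echo "no such file: $file"; return 1; }
  # Arithmetic expansion strips the padding some wc builds emit.
  size=$(( $(wc -c < "$file") ))
  if [ "$size" -gt "$MAX_BYTES" ]; then
    echo "too large: $size bytes (limit $MAX_BYTES)"
    return 1
  fi
  echo "ok: $file ($size bytes)"
}

# Usage (assumes screenshot.png exists in the current directory):
#   check_image_input screenshot.png
```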

Images are metered as visual tokens: each 28x28-pixel patch costs one token, so a 1000x1000 px image covers 36 x 36 patches (partial patches rounded up) and costs about 1,296 tokens. In an agent workflow this vision ability is the other half of the image generation loop. Claude requests an image through an MCP tool, inspects the result with vision, critiques it, and requests a new render with a refined prompt, all without you leaving the terminal.
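The token arithmetic can be checked with a few lines of shell. This sketch assumes the one-token-per-28x28-patch rule stated above, with partial patches rounded up; the function names visual_tokens and loop_tokens are hypothetical, and actual billing may differ by model.

```shell
# Estimate visual tokens for one image: count 28x28 patches across the
# width and height, rounding partial patches up (assumed rule).
visual_tokens() {
  width=$1
  height=$2
  patch=28
  cols=$(( (width + patch - 1) / patch ))
  rows=$(( (height + patch - 1) / patch ))
  echo $(( cols * rows ))
}

# Estimate the vision cost of a generate-inspect-regenerate loop in
# which Claude inspects one image per round.
loop_tokens() {
  rounds=$1
  per_image=$(visual_tokens "$2" "$3")
  echo $(( rounds * per_image ))
}

visual_tokens 1000 1000   # prints 1296 (36 x 36 patches)
loop_tokens 3 1000 1000   # prints 3888 (three inspections)
```

The second figure is why iterating on fine details adds up: every inspection round pays the image's token cost again, on top of whatever the image backend charges per render.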

How to Add Image Generation to Claude Code

Adding image generation to Claude Code means registering an MCP server that wraps an image model's API, using the standard claude mcp add flow covered in our Claude Code MCP guide. Three real, verified servers cover the main backends:

# Replicate (Flux Schnell + SVG): needs a Replicate API token
claude mcp add replicate-flux -e REPLICATE_API_TOKEN=your_token -- npx -y replicate-flux-mcp

# ComfyUI (local, no API key): run these slash commands inside a Claude Code session
/plugin marketplace add artokun/comfyui-mcp
/plugin install comfy

# OpenAI gpt-image-1: clone and build, then point your MCP config
# at the built index.js with OPENAI_API_KEY set
git clone https://github.com/SureScaleAI/openai-gpt-image-mcp
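For the clone-and-build route, the MCP config entry might look roughly like the output of the sketch below. It assumes the common mcpServers JSON layout that Claude Code reads from a project-level .mcp.json file. The server name openai-gpt-image, the helper print_mcp_entry, and the example path are all placeholders: substitute the real location of the index.js your build produces, and your own key.

```shell
# Hypothetical helper: print an mcpServers entry that launches the built
# server with node. The path argument must point at your built index.js.
print_mcp_entry() {
  index_js=$1
  cat <<EOF
{
  "mcpServers": {
    "openai-gpt-image": {
      "command": "node",
      "args": ["$index_js"],
      "env": { "OPENAI_API_KEY": "your_key" }
    }
  }
}
EOF
}

# Placeholder path: replace it with wherever your build put index.js.
print_mcp_entry "/absolute/path/to/built/index.js"
```

Review the printed JSON before merging it into an existing config by hand, since redirecting it straight to .mcp.json would overwrite any servers already registered there.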

replicate-flux-mcp defaults to black-forest-labs/flux-schnell for raster images and recraft-ai/recraft-v3-svg for vector output, and exposes nine tools including generate_image, generate_image_variants, and generate_svg. openai-gpt-image-mcp exposes create-image and edit-image against OpenAI's gpt-image-1 API, returning base64 or saving to disk (it switches to file output automatically past the 1 MB MCP payload limit). comfyui-mcp is the local-first route: 108 tools and 29 skills for driving a ComfyUI instance running on your own hardware, with curated support for Flux, WAN, Qwen, and Stable Diffusion model families.
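The payload-limit switch mentioned above can be illustrated with a small sketch. This is not the server's actual code. It assumes the 1 MB limit means 1,048,576 bytes and that the comparison is made against the base64-encoded size, which is about 4/3 of the raw image size; the name output_mode is hypothetical.

```shell
# Assumed reading of the 1 MB MCP payload limit, in bytes.
MCP_PAYLOAD_LIMIT=1048576

# Hypothetical decision rule: return "base64" if the encoded image fits
# in one MCP response, otherwise "file" (save to disk, return the path).
output_mode() {
  raw_bytes=$1
  # Base64 emits 4 output bytes for every 3 input bytes, padded up.
  b64_bytes=$(( (raw_bytes + 2) / 3 * 4 ))
  if [ "$b64_bytes" -gt "$MCP_PAYLOAD_LIMIT" ]; then
    echo file
  else
    echo base64
  fi
}

output_mode 500000   # prints base64 (encodes to 666668 bytes)
output_mode 900000   # prints file   (encodes to 1200000 bytes)
```

Under these assumptions, images above roughly 786,000 raw bytes would be written to disk, so plan for a writable output directory when generating large or high-quality images.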

The same servers work in OpenClaw and any other MCP-capable agent, since MCP is a client-agnostic protocol.

Image Generation MCP Servers Compared

The right server depends on which backend you already pay for and whether you have a GPU; here is how the verified options line up, with star counts from GitHub as of July 2026.

Server	Backend	Stars	Cost model	Best for
replicate-flux-mcp	Replicate (Flux Schnell, Recraft SVG)	103	Replicate per-run billing	Fast setup via npx, raster plus SVG output
openai-gpt-image-mcp	OpenAI gpt-image-1 (also Azure OpenAI)	103	OpenAI API per-image pricing	Generation plus mask-based image editing
comfyui-mcp	Local ComfyUI (Flux, SDXL, Qwen, WAN)	242	Free per image, your GPU and power	Unlimited local generation, full workflow control
mcp-hfspace	Hugging Face Spaces (including FLUX.1-schnell)	386	Free tier available on public Spaces	Trying image generation without paid API keys

Which Option Should You Pick?

For most Claude Code users, replicate-flux-mcp is the fastest working setup: one npx command, one API token, and Flux Schnell is both cheap and quick on Replicate. Choose openai-gpt-image-mcp when you specifically need editing with masks or already run OpenAI or Azure billing. Choose comfyui-mcp when you have an NVIDIA or Apple Silicon GPU and want unlimited generations, custom workflows, or LoRA-based styles without per-image fees; note it requires Node.js 22+ and a running ComfyUI install.
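The guidance above can be condensed into a small decision sketch. The function name pick_server and the order of the checks are assumptions made for illustration. In particular, putting the mask-editing need ahead of GPU availability is one reasonable reading of the advice, not a rule from any of the projects.

```shell
# Hypothetical helper: suggest a server from two yes/no answers.
#   $1: do you need mask-based editing?  (yes/no)
#   $2: do you have a capable local GPU? (yes/no)
pick_server() {
  need_mask_editing=$1
  has_local_gpu=$2
  if [ "$need_mask_editing" = "yes" ]; then
    echo openai-gpt-image-mcp
  elif [ "$has_local_gpu" = "yes" ]; then
    echo comfyui-mcp
  else
    echo replicate-flux-mcp
  fi
}

pick_server no no    # prints replicate-flux-mcp
pick_server no yes   # prints comfyui-mcp
pick_server yes yes  # prints openai-gpt-image-mcp
```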

If you want to survey more options before committing, the MCP directory indexes thousands of servers, including dozens of image generation wrappers for Gemini image models, Amazon Bedrock Nova Canvas, and other backends. Our guide to finding MCP servers from inside Claude Code shows how to evaluate them before installing.

Limitations and Tradeoffs

MCP-based image generation is a bolt-on, and it inherits the limits of whichever backend you wire in. Cloud servers bill per image, need API keys stored in your MCP config, and send your prompts to a third party, which matters for confidential product work. Local ComfyUI avoids all of that but trades it for GPU requirements, model downloads measured in gigabytes, and real maintenance overhead.

Also set expectations for the Claude side. Claude never "sees" the image the way the generator does; it sees the file after the fact through vision, so iterating on fine details can take several generate-inspect-regenerate rounds and each inspection costs visual tokens. And the smaller community servers here have low star counts, so review the source before trusting one with an API key. When you only need diagrams, icons, or UI mockups, skip image generation entirely; Claude writes SVG and HTML/CSS natively and the result is editable text.

Frequently Asked Questions

Can Claude generate images like DALL-E?

No. Claude is an image understanding model only; Anthropic's documentation states it cannot generate, produce, edit, manipulate, or create images. To get DALL-E-style output in a Claude workflow, connect an MCP server that wraps an image model such as OpenAI's gpt-image-1 or Flux on Replicate.

How do I generate images in Claude Code?

Add an image generation MCP server. For example, run claude mcp add replicate-flux -e REPLICATE_API_TOKEN=your_token -- npx -y replicate-flux-mcp, then ask Claude to generate an image. It will call the server's generate_image tool with a prompt it writes for you.

Is there a free way to add image generation to Claude?

Yes, two. Run ComfyUI locally with the comfyui-mcp plugin, which is free per image if you have a capable GPU, or use mcp-hfspace to call public Hugging Face Spaces such as FLUX.1-schnell, which has a free tier subject to rate limits.

What image formats does Claude support as input?

Claude accepts JPEG, PNG, GIF, and WebP images. Limits on the Claude API are 10 MB per image, dimensions up to 8000x8000 px, and up to 100 images per request on 200k-context models. Animated GIFs are read as their first frame only.
