codereview-mcp
  
An MCP server that lets an AI coding agent review code with a language model. Point it at a git diff, a file, or a snippet and it returns a structured review with severity levels. It works with any MCP client (Claude Code, Cursor, VS Code, Continue, Windsurf) and supports Ollama (local, the default), OpenAI, Anthropic, OpenRouter, and any OpenAI-compatible server (llama.cpp, vLLM, LM Studio, and similar).
With the default Ollama backend, code never leaves your machine.
What it does
- Reviews git diff output, parsing it per file and skipping binaries and deletions
- Reviews whole files, with language auto-detected from the extension (50+ languages)
- Reviews inline snippets in a language you specify
- Returns findings tagged by severity: CRITICAL, WARNING, SUGGESTION, NOTE
- Retries transient provider failures (rate limits, timeouts, 5xx) with backoff
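The retry behaviour in the last bullet can be pictured with a minimal sketch. This is not the server's actual code: `TransientError`, `call_with_retry`, and the delay values are hypothetical names and numbers chosen for illustration, assuming a plain exponential backoff with a little jitter.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for rate limits, timeouts, and 5xx responses."""


def call_with_retry(fn, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call fn(), retrying on TransientError with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last failure
            # Delay doubles each attempt (base, 2*base, 4*base, ...).
            delay = base_delay * 2 ** (attempt - 1)
            # Jitter of up to half the delay keeps clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

Errors that are not transient (a bad API key, for example) are deliberately not caught here, since retrying them would only delay the failure.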
It is a reviewer, not a gate. The output is advice from a model and should be read by a human, not wired straight into an automatic merge. See SECURITY.md for the threat model, including prompt injection.
Install
Requires Python 3.10+. Install from the repository:
pip install git+https://github.com/lfylow/codereview-mcp
Or from a clone:
git clone https://github.com/lfylow/codereview-mcp
cd codereview-mcp
pip install .
For the default local backend, install Ollama and pull a model. A coding-tuned model gives noticeably better reviews than a small general model:
ollama pull qwen2.5-coder:7b # recommended
# or the smaller default, lighter on RAM:
ollama pull llama3.2:3b
Then run the server (it speaks MCP over stdio, so it's normally launched by a client rather than by hand):
codereview-mcp --model qwen2.5-coder:7b
To see which models are installed on your Ollama server:
codereview-mcp --list-models
Connect an agent
Claude Code
claude mcp add codereview-mcp -- codereview-mcp
To use a hosted provider, pass the environment through:
claude mcp add codereview-mcp --env LLM_PROVIDER=openai --env OPENAI_API_KEY=sk-... -- codereview-mcp
VS Code (Copilot) — .vscode/mcp.json
{
"servers": {
"codereview-mcp": { "type": "stdio", "command": "codereview-mcp" }
}
}
Cursor — .cursor/mcp.json
{
"mcpServers": {
"codereview-mcp": { "command": "codereview-mcp" }
}
}
Continue / Windsurf
Both use the same shape as Cursor: an mcpServers entry with "command": "codereview-mcp". Add provider keys under an "env" object on that entry if you aren't using Ollama.
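As a rough illustration of that shape, the sketch below builds the entry and prints it as JSON. `mcp_entry` is a hypothetical helper, the key value is a placeholder, and where the file lives depends on your client, so check its documentation before writing it anywhere.

```python
import json


def mcp_entry(env=None):
    """Build an mcpServers entry for codereview-mcp (hypothetical helper)."""
    entry = {"command": "codereview-mcp"}
    if env:
        # Provider settings go under "env" when you aren't using Ollama.
        entry["env"] = dict(env)
    return {"mcpServers": {"codereview-mcp": entry}}


config = mcp_entry({"LLM_PROVIDER": "anthropic", "ANTHROPIC_API_KEY": "<your key>"})
print(json.dumps(config, indent=2))
```

Calling `mcp_entry()` with no arguments yields the same entry as the Cursor example, which is all the default Ollama backend needs.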
Once connected, ask the agent things like "review my staged changes" or "review src/auth.py" and it will call the matching tool.
Tools
| Tool | Use it for |
|---|---|
| review_git_diff(diff) | Output of git diff. Reviewed per file. |
| review_code_file(filepath) | A file on disk. Language detected from the extension. |
| review_code_snippet(code, language="python") | A piece of code not yet in a file. |
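To give a feel for what "reviewed per file" involves, here is a minimal sketch of splitting `git diff` output into per-file sections while skipping binaries and deletions. It is an illustration only, not the server's parser: `split_diff` is a hypothetical name, and the regex assumes ordinary paths without spaces or quoting.

```python
import re

# Matches the header line git writes at the start of each file's section.
FILE_HEADER = re.compile(r"^diff --git a/(.+?) b/(.+)$", re.MULTILINE)


def split_diff(diff):
    """Return {path: section} for each file, skipping binary and deleted files."""
    sections = {}
    matches = list(FILE_HEADER.finditer(diff))
    for i, m in enumerate(matches):
        # A section runs from its header to the next header (or end of input).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(diff)
        section = diff[m.start():end]
        if "\ndeleted file mode " in section:
            continue  # the file is gone, nothing left to review
        if "\nBinary files " in section or "\nGIT binary patch" in section:
            continue  # no text hunks to review
        sections[m.group(2)] = section
    return sections
```

The marker checks only match at the start of a line, so a source line that happens to contain the words "Binary files" inside a hunk (where every line begins with `+`, `-`, or a space) does not cause a file to be skipped.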
Providers
| Provider | Set LLM_PROVIDER to | Connection | Notes |
|---|---|---|---|
| Ollama | ollama (default) | local server, no key | Code stays on your machine |
| OpenAI | openai | OPENAI_API_KEY | Also any OpenAI-compatible API via OPENAI_BASE_URL |
| Anthropic | anthropic | ANTHROPIC_API_KEY | |
| OpenRouter | openrouter | OPENROUTER_API_KEY | One key, many models — see openrouter.ai/models |
| OpenAI-compatible (llama.cpp, vLLM, LM Studio) | openai + OPENAI_BASE_URL | usually no key | Local servers generally ignore the key |
Authentication is by API key. The hosted providers don't offer an official way to use an account subscription in place of an API key for programmatic API access, so that mode isn't supported.
# OpenRouter
LLM_PROVIDER=openrouter OPENROUTER_API_KEY=sk-or-... codereview-mcp
# local OpenAI-compatible server (e.g. LM Studio on :1234), no key needed
LLM_PROVIDER=openai OPENAI_BASE_URL=http://localhost:1234/v1 codereview-mcp --model my-local-model
Models
Review quality tracks the model. For local use, a coding-tuned model is worth the extra download:
| Model | Pull with | Notes |
|---|---|---|
| Qwen2.5-Coder | ollama pull qwen2.5-coder:7b | Strong all-round code model; :14b/:32b if you have the VRAM |
| Qwen3-Coder | ollama pull qwen3-coder | Newer Qwen coding model |
| DeepSeek-Coder V2 | ollama pull deepseek-coder-v2 | Good multi-language coverage |
| Codestral | ollama pull codestral | Mistral's code model |
| CodeLlama | ollama pull codellama | Widely available baseline |
| Llama 3.2 3B | ollama pull llama3.2:3b | The default — small and fast, lighter reviews |
Run codereview-mcp --list-models to see what's installed locally. Pick a model with --model, OLLAMA_MODEL, or the config file.
Configuration
Configuration is read from defaults, then a config file, then environment variables, then CLI flags — each layer overriding the previous one.
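The precedence order can be sketched in a few lines. This assumes each layer has already been parsed into a dict that contains only the keys the user actually set; `resolve` and the dict names are hypothetical, and the real loader may be structured differently.

```python
from collections import ChainMap

# Defaults copied from the environment-variable table; key names follow the config file.
DEFAULTS = {"llm_provider": "ollama", "temperature": 0.3, "max_tokens": 4096}


def resolve(file_cfg, env_cfg, cli_cfg):
    """Merge the four layers; later layers override earlier ones."""
    # ChainMap searches its maps left to right, so the highest-priority layer goes first.
    return dict(ChainMap(cli_cfg, env_cfg, file_cfg, DEFAULTS))


settings = resolve(
    file_cfg={"temperature": 0.2, "max_tokens": 2048},     # config.yml
    env_cfg={"llm_provider": "openai", "max_tokens": 1024},  # LLM_PROVIDER, MAX_TOKENS
    cli_cfg={"llm_provider": "anthropic"},                  # --provider anthropic
)
print(settings)
```

In this example the provider comes from the CLI flag, `max_tokens` from the environment, and `temperature` from the config file, since no higher layer sets it.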
Environment variables
| Variable | Default | Description |
|---|---|---|
| LLM_PROVIDER | ollama | ollama, openai, anthropic, or openrouter |
| OLLAMA_BASE_URL | http://localhost:11434 | Ollama server URL |
| OLLAMA_MODEL | llama3.2:3b | Ollama model name |
| OPENAI_API_KEY | — | OpenAI API key (optional for local servers) |
| OPENAI_MODEL | gpt-4o-mini | OpenAI model name |
| OPENAI_BASE_URL | — | Custom OpenAI-compatible endpoint |
| ANTHROPIC_API_KEY | — | Anthropic API key |
| ANTHROPIC_MODEL | claude-sonnet-4-6 | Anthropic model name |
| OPENROUTER_API_KEY | — | OpenRouter API key |
| OPENROUTER_MODEL | qwen/qwen3-coder-30b-a3b-instruct | OpenRouter model slug |
| OPENROUTER_BASE_URL | https://openrouter.ai/api/v1 | OpenRouter endpoint |
| MAX_TOKENS | 4096 | Max tokens per response |
| TEMPERATURE | 0.3 | Sampling temperature (0–2) |
| REQUEST_TIMEOUT | 120 | Per-request timeout in seconds |
| MAX_INPUT_CHARS | 100000 | Reject inputs larger than this |
| STREAM | true | Stream responses; falls back to a single request if unsupported |
| CUSTOM_PROMPT | — | Override the review prompt (must contain {code} and {language}) |
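Before setting CUSTOM_PROMPT, it can help to check that the template carries both placeholders. The sketch below assumes the placeholders follow Python's `str.format` syntax, which the `{code}` and `{language}` notation suggests but this README does not confirm; `missing_placeholders` is a hypothetical helper, not part of the package.

```python
import string

REQUIRED_FIELDS = {"code", "language"}


def missing_placeholders(template):
    """Return the required placeholder names that the template does not use."""
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for trailing literal text, so those entries are filtered out.
    found = {name for _, name, _, _ in string.Formatter().parse(template) if name}
    return REQUIRED_FIELDS - found


print(missing_placeholders("Review this {language} code:\n{code}"))
print(missing_placeholders("Review:\n{code}"))
```

Under the same assumption, a template that contains literal braces (a JSON example, say) would need them doubled as `{{` and `}}`; otherwise the parse step raises `ValueError`.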
Config file
~/.config/codereview-mcp/config.yml on Linux/macOS (respects XDG_CONFIG_HOME), or %APPDATA%\codereview-mcp\config.yml on Windows:
llm_provider: ollama
ollama_model: llama3.2:3b
temperature: 0.2
max_tokens: 2048
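If you want to locate the file from a script, the lookup described above can be approximated as follows. This is a sketch based only on the paths listed here; `default_config_path` is a hypothetical helper and the server's own resolution logic may differ in edge cases.

```python
import os
import sys
from pathlib import Path


def default_config_path(env=os.environ, platform=sys.platform):
    """Best-guess location of config.yml, following the paths listed above."""
    if platform.startswith("win"):
        # %APPDATA% is normally set on Windows; a KeyError here means it is not.
        base = Path(env["APPDATA"])
    elif env.get("XDG_CONFIG_HOME"):
        base = Path(env["XDG_CONFIG_HOME"])
    else:
        base = Path.home() / ".config"
    return base / "codereview-mcp" / "config.yml"


print(default_config_path())
```

The `env` and `platform` parameters exist only so the function can be exercised without touching the real environment. To bypass the lookup entirely, pass an explicit path with `--config`.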
CLI flags
codereview-mcp --help
codereview-mcp --version
codereview-mcp --provider openai --model gpt-4o
codereview-mcp --config /path/to/config.yml
codereview-mcp --list-models # list installed Ollama models and exit
codereview-mcp --no-stream # disable streaming
codereview-mcp --verbose
Example output
A review of a small Python file looks like this:
## Review: `src/database.py` (python)
🔴 CRITICAL — SQL injection
`f"SELECT * FROM users WHERE id = {user_id}"` interpolates user input into SQL.
Use a parameterized query: `cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,))`
🟡 WARNING — connection never closed
The connection opened on line 2 is never closed. Use a context manager:
`with sqlite3.connect("users.db") as conn:`
🟢 SUGGESTION — use the built-in
`calculate_average` can be `sum(numbers) / len(numbers)`.
The exact wording depends on the model you run.
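Because of that variation, any script that reads the output should be forgiving. As a minimal sketch, the snippet below tallies findings by severity so a human reader gets a quick summary. It assumes each finding starts a line with an optional marker followed by the severity word, as in the example above; `tally` is a hypothetical helper, and a model that formats findings differently will simply not be counted.

```python
import re
from collections import Counter

SEVERITIES = ("CRITICAL", "WARNING", "SUGGESTION", "NOTE")

# A finding line is assumed to look like "<marker> SEVERITY <separator> title".
# \W* skips emoji, bullets, and spaces before the severity word.
FINDING = re.compile(r"^\W*(" + "|".join(SEVERITIES) + r")\b", re.MULTILINE)


def tally(review):
    """Count findings per severity in a review's text."""
    return Counter(FINDING.findall(review))
```

A tally like this is meant for a summary line or a dashboard, in keeping with the advice to treat the review as advisory.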
Troubleshooting
"connection refused" / transport errors — Ollama isn't running or isn't on the expected URL. Start it with ollama serve and check OLLAMA_BASE_URL.
Empty or low-quality reviews — small local models miss things. Try a larger model (OLLAMA_MODEL=qwen2.5-coder:7b) or a hosted provider.
"Input too large" — the file or diff exceeds MAX_INPUT_CHARS. Review a smaller chunk or raise the limit.
"API key is required" — set the key variable that matches the chosen provider: OPENAI_API_KEY, ANTHROPIC_API_KEY, or OPENROUTER_API_KEY.
Run with --verbose to see what the server is doing on stderr.
Limitations
- Review quality depends entirely on the backing model. Small local models are fast and private but less thorough than large hosted ones.
- The tool reviews the added/changed lines of a diff with surrounding context, not the full repository, so it can miss issues that span files.
- Output is non-deterministic and advisory. It is not a substitute for tests or human review.
Development
git clone https://github.com/lfylow/codereview-mcp
cd codereview-mcp
pip install -e ".[dev]"
pytest
ruff check src/ tests/
ruff format --check src/ tests/
mypy
See CONTRIBUTING.md for more.
License
MIT — see LICENSE.






