Deep Research MCP
      
A Python-based agent that integrates research providers with Claude Code through the Model Context Protocol (MCP). It supports OpenAI (Responses API with web search and code interpreter, or Chat Completions API for broad provider compatibility), Gemini Deep Research via the Interactions API, Allen AI's DR-Tulu research agent, and the open-source Open Deep Research stack (based on smolagents).
Prerequisites
- Python 3.11+
- uv installed
- One of:
  - OpenAI API access (Responses API models, e.g., o4-mini-deep-research-2025-06-26)
  - Gemini API access with the Interactions API / Deep Research agent enabled
  - DR-Tulu service running locally or remotely (see DR-Tulu setup)
  - Open Deep Research dependencies (installed via uv sync --extra open-deep-research)
- Claude Code, or any other assistant supporting MCP integration
Installation
Recommended setup (resolves the latest compatible versions):
# Install runtime dependencies + project in editable mode
uv sync --upgrade
# Development tooling (pytest, black, pylint, mypy, pre-commit)
uv sync --upgrade --extra dev
# Enable the pre-commit hook so black runs automatically before each commit
uv run pre-commit install
# Optional docs tooling
uv sync --upgrade --extra docs
# Optional Open Deep Research provider dependencies
uv sync --upgrade --extra open-deep-research
Compatibility setup (pip-based):
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
pip install -e .
Code Layout
- src/deep_research_mcp/agent.py: orchestration layer; owns clarification, instruction building, and callbacks, and delegates provider work to backends
- src/deep_research_mcp/backends/: provider-specific implementations for OpenAI, Gemini, DR-Tulu, and Open Deep Research
- src/deep_research_mcp/mcp_server.py: FastMCP server and tool entrypoints
- src/deep_research_mcp/clarification.py: clarification agents, sessions, and enrichment flow
- src/deep_research_mcp/prompts/: YAML prompt templates used by clarification and instruction building
- cli/deep-research-cli.py: unified CLI for agent mode, MCP client mode, and configuration viewing
- cli/deep-research-tui.py: interactive full-screen terminal UI for clarification, research, status checks, and saving output to disk
- tests/: pytest suite covering configuration, MCP integration, prompts, results, and clarification flows
Configuration
Configuration File
Create a ~/.deep_research file in your home directory using TOML format.
Library note: ResearchConfig.load() explicitly reads this file and applies environment variable overrides. ResearchConfig.from_env() reads environment variables only.
Common settings:
[research] # Core Deep Research functionality
provider = "openai" # Available options: "openai", "dr-tulu", "gemini", "open-deep-research" -- defaults to "openai"
api_style = "responses" # Only applies to provider="openai"; use "chat_completions" for Perplexity, Groq, Ollama, etc.
model = "o4-mini-deep-research-2025-06-26" # OpenAI: model identifier; Dr Tulu: logical provider id; Gemini: agent id; ODR: LiteLLM model identifier
api_key = "your-api-key" # API key, optional
base_url = "https://api.openai.com/v1" # OpenAI: OpenAI-compatible endpoint; Dr Tulu: service base URL; Gemini: https://generativelanguage.googleapis.com; ODR: LiteLLM-compatible endpoint
# Task behavior
timeout = 1800
poll_interval = 30
cancel_on_timeout = false # When true, cancel the provider task if it exceeds timeout. Default false: the task keeps running and the result can be recovered with research_status
# Largely based on https://cookbook.openai.com/examples/deep_research_api/introduction_to_deep_research_api_agents
[clarification] # Optional query clarification component
enable = true
triage_model = "gpt-5-mini"
clarifier_model = "gpt-5-mini"
instruction_builder_model = "gpt-5-mini"
api_key = "YOUR_OPENAI_API_KEY" # Optional, overrides api_key
base_url = "https://api.openai.com/v1" # Optional, overrides base_url
[logging]
level = "INFO"
Note on precedence: [research] api_key and base_url map to the RESEARCH_API_KEY and RESEARCH_BASE_URL settings, which apply to every provider. If you switch provider via an environment override (e.g. RESEARCH_PROVIDER=openai) while the file is configured for another provider, also override RESEARCH_API_KEY and RESEARCH_BASE_URL, otherwise the file's key and endpoint are used.
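The precedence note can be turned into a quick pre-flight check. The sketch below is not part of deep-research-mcp: missing_overrides is a hypothetical helper that relies only on the three environment variable names mentioned in the note, and it assumes the file does set a key and endpoint for its own provider.

```python
import os
from collections.abc import Mapping

# Settings that keep the file's values unless they are overridden too
# (names taken from the precedence note).
PAIRED_VARS = ("RESEARCH_API_KEY", "RESEARCH_BASE_URL")


def missing_overrides(file_provider: str, env: Mapping[str, str]) -> list[str]:
    """Return the env vars that should accompany a RESEARCH_PROVIDER override.

    Hypothetical helper: it only mirrors the rule described in the note.
    """
    env_provider = env.get("RESEARCH_PROVIDER")
    if not env_provider or env_provider == file_provider:
        return []  # no provider switch, nothing to warn about
    return [name for name in PAIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    # Example: the file is configured for Gemini, the shell may switch provider.
    for name in missing_overrides("gemini", os.environ):
        print(f"warning: {name} is not set; the value from ~/.deep_research will be used")
```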
OpenAI provider example:
[research]
provider = "openai"
model = "o4-mini-deep-research-2025-06-26" # OpenAI model
api_key = "YOUR_OPENAI_API_KEY" # Defaults to OPENAI_API_KEY
base_url = "https://api.openai.com/v1" # OpenAI-compatible endpoint
timeout = 1800
poll_interval = 30
Gemini Deep Research provider example:
[research]
provider = "gemini"
model = "deep-research-preview-04-2026" # Gemini Deep Research agent id
api_key = "YOUR_GEMINI_API_KEY" # Defaults to GEMINI_API_KEY or GOOGLE_API_KEY
base_url = "https://generativelanguage.googleapis.com"
timeout = 1800
poll_interval = 30
Dr Tulu provider example:
[research]
provider = "dr-tulu"
model = "dr-tulu" # Logical provider model id; currently informational
base_url = "http://localhost:8080/" # Dr Tulu service base URL; the backend calls /chat
api_key = "" # Optional; defaults to RESEARCH_API_KEY / DR_TULU_API_KEY if set
timeout = 1800
poll_interval = 30
Running Dr Tulu locally:
- Clone and configure dr-tulu separately.
- In dr-tulu/agent/.env, set at least:
  - SERPER_API_KEY
  - JINA_API_KEY
  - S2_API_KEY (optional but recommended)
- Start the DR-Tulu model server:
cd /path/to/dr-tulu/agent
conda run -n vllm bash -lc '
CUDA_VISIBLE_DEVICES=0 \
vllm serve rl-research/DR-Tulu-8B \
--port 30001 \
--dtype auto \
--max-model-len 16384 \
--gpu-memory-utilization 0.60 \
--enforce-eager
'
- Start the Dr Tulu MCP backend:
cd /path/to/dr-tulu/agent
conda run -n vllm python -m dr_agent.mcp_backend.main --port 8000
- Start the Dr Tulu app service:
cd /path/to/dr-tulu/agent
conda run -n vllm python workflows/auto_search_sft.py serve \
--port 8080 \
--config workflows/auto_search_sft.yaml \
--config-overrides "search_agent_max_tokens=12000,browse_agent_max_tokens=12000"
- Point deep-research-mcp at that service:
[research]
provider = "dr-tulu"
base_url = "http://localhost:8080/"
timeout = 1800
The dr-tulu backend calls POST {base_url}/chat, so if you front Dr Tulu behind a different host, port, or reverse proxy, update base_url accordingly.
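As a small illustration of how base_url and the /chat path combine, the sketch below normalizes the trailing slash before joining. How the backend actually builds the URL is not shown in this README, so chat_endpoint is a hypothetical helper and the proxy hostname is a placeholder.

```python
def chat_endpoint(base_url: str) -> str:
    """Join a configured base_url with the /chat path without doubling slashes."""
    return base_url.rstrip("/") + "/chat"


# Local service from the example configuration
print(chat_endpoint("http://localhost:8080/"))  # http://localhost:8080/chat
# Placeholder reverse-proxy prefix
print(chat_endpoint("https://proxy.example.com/dr-tulu"))  # https://proxy.example.com/dr-tulu/chat
```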
Perplexity (via Sonar Deep Research and Perplexity's OpenAI-compatible endpoint) provider example:
[research]
provider = "openai"
api_style = "chat_completions" # Required for Perplexity (no Responses API)
model = "sonar-deep-research" # Perplexity's Sonar Deep Research
api_key = "ppl-..." # Defaults to OPENAI_API_KEY
base_url = "https://api.perplexity.ai" # Perplexity's OpenAI-compatible endpoint
timeout = 1800
Open Deep Research provider example:
[research]
provider = "open-deep-research"
model = "openai/qwen/qwen3-coder-30b" # LiteLLM-compatible model id
base_url = "http://localhost:1234/v1" # LiteLLM-compatible endpoint (local or remote)
api_key = "" # Optional if endpoint requires it
timeout = 1800
Ollama (local) provider example:
[research]
provider = "openai"
api_style = "chat_completions"
model = "llama3.1" # Any model available in your Ollama instance
base_url = "http://localhost:11434/v1" # Ollama's OpenAI-compatible endpoint
api_key = "" # Not required for local Ollama
timeout = 600
llama-server (local llama.cpp server) provider example:
[research]
provider = "openai"
api_style = "chat_completions"
model = "qwen2.5-0.5b" # Must match the --alias passed to llama-server
base_url = "http://127.0.0.1:8081/v1" # llama-server OpenAI-compatible endpoint
api_key = "test" # Must match the --api-key passed to llama-server
timeout = 600
Generic OpenAI-compatible Chat Completions provider (Groq, Together AI, vLLM, etc.):
[research]
provider = "openai"
api_style = "chat_completions"
model = "your-model-name"
api_key = "your-api-key"
base_url = "https://api.your-provider.com/v1"
timeout = 600
Optional env variables for Open Deep Research tools:
- SERPAPI_API_KEY or SERPER_API_KEY: enable Google-style search
- HF_TOKEN: optional, logs into Hugging Face Hub for gated models
Install The Included Skill In Claude Code Or Codex
This repository also ships a repo-specific skill guide at skills/deep-research-mcp/SKILL.md.
Installing that file as a local skill gives Claude Code or Codex a focused playbook for this repository's CLI, Python API, providers, and MCP server. It does not install Python dependencies, start the MCP server, or replace the provider configuration in ~/.deep_research. Treat it as complementary to the MCP setup below, not a substitute for it.
Claude Code skill install
Claude Code's skills docs use these locations:
- personal skill: ~/.claude/skills/<skill-name>/SKILL.md
- project skill: .claude/skills/<skill-name>/SKILL.md
Personal install:
mkdir -p ~/.claude/skills/deep-research-mcp
cp /path/to/deep-research-mcp/skills/deep-research-mcp/SKILL.md \
~/.claude/skills/deep-research-mcp/SKILL.md
If you prefer to keep the installed skill linked to this checkout:
mkdir -p ~/.claude/skills/deep-research-mcp
ln -sf /path/to/deep-research-mcp/skills/deep-research-mcp/SKILL.md \
~/.claude/skills/deep-research-mcp/SKILL.md
Project-local install from the repository root:
mkdir -p .claude/skills/deep-research-mcp
cp skills/deep-research-mcp/SKILL.md .claude/skills/deep-research-mcp/SKILL.md
After that, restart Claude Code or open a new session if the skill does not appear immediately. The skill can be invoked directly as /deep-research-mcp, and Claude can also load it automatically when the task matches the skill description. For a reusable shared extension, package the same skill into a Claude Code plugin instead of copying it by hand.
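If you would rather script the install than type the shell commands, the copy and symlink variants can be sketched in Python. install_skill is a hypothetical helper, not something shipped in this repository; it assumes only the skills/deep-research-mcp/SKILL.md path and the target layout described above.

```python
import shutil
from pathlib import Path


def install_skill(repo_root: Path, skills_dir: Path, link: bool = False) -> Path:
    """Copy (or symlink) SKILL.md into <skills_dir>/deep-research-mcp/."""
    source = repo_root / "skills" / "deep-research-mcp" / "SKILL.md"
    target_dir = skills_dir / "deep-research-mcp"
    target_dir.mkdir(parents=True, exist_ok=True)  # same effect as mkdir -p
    target = target_dir / "SKILL.md"
    # Remove any earlier copy or link so repeated installs behave like cp / ln -sf.
    target.unlink(missing_ok=True)
    if link:
        target.symlink_to(source.resolve())
    else:
        shutil.copyfile(source, target)
    return target
```

For a personal install, call install_skill(Path("/path/to/deep-research-mcp"), Path.home() / ".claude" / "skills"); pass link=True to keep the installed skill tied to the checkout.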
Official docs:
- Claude Code skills: <https://docs.claude.com/en/docs/claude-code/skills>
- Claude Code plugins: <https://docs.claude.com/en/docs/claude-code/plugins>
Codex skill install
Current Codex docs describe skills as skill folders containing SKILL.md, stored globally in $HOME/.agents/skills or repo-locally in .agents/skills.
Personal install:
mkdir -p ~/.agents/skills/deep-research-mcp
cp /path/to/deep-research-mcp/skills/deep-research-mcp/SKILL.md \
~/.agents/skills/deep-research-mcp/SKILL.md
Or keep the installed skill linked to this checkout:
mkdir -p ~/.agents/skills/deep-research-mcp
ln -sf /path/to/deep-research-mcp/skills/deep-research-mcp/SKILL.md \
~/.agents/skills/deep-research-mcp/SKILL.md
Project-local install from the repository root:
mkdir -p .agents/skills/deep-research-mcp
cp skills/deep-research-mcp/SKILL.md .agents/skills/deep-research-mcp/SKILL.md
If your Codex setup still follows older CLI conventions that use ~/.codex/skills or .codex/skills, mirror the same directory structure there instead.
Codex can invoke the skill implicitly when the task matches its description, or explicitly by mentioning $deep-research-mcp in the prompt. If the skill does not show up immediately, restart Codex.
Official docs:
- Codex skills: <https://developers.openai.com/codex/skills>
- Codex customization and skill locations: <https://developers.openai.com/codex/concepts/customization#skills>
Claude Code Integration
- Configure MCP Server
Choose one of the transports below.
Option A: stdio (recommended when Claude Code should spawn the server itself)
If your provider credentials are already stored in ~/.deep_research, the minimal setup is:
claude mcp add deep-research -- uv run --directory /path/to/deep-research-mcp deep-research-mcp
If you want Claude Code to pass OPENAI_API_KEY through to the spawned MCP process explicitly, use:
claude mcp add -e OPENAI_API_KEY="$OPENAI_API_KEY" \
deep-research -- \
uv run --directory /path/to/deep-research-mcp deep-research-mcp
Option B: HTTP (recommended when you want to run the server separately)
Start the server in one terminal:
OPENAI_API_KEY="$OPENAI_API_KEY" \
uv run --directory /path/to/deep-research-mcp \
deep-research-mcp --transport http --host 127.0.0.1 --port 8080
Then add the HTTP MCP server in Claude Code:
claude mcp add --transport http deep-research-http http://127.0.0.1:8080/mcp
Replace /path/to/deep-research-mcp/ with the actual path to your cloned repository. The verified Streamable HTTP endpoint is http://127.0.0.1:8080/mcp.
For multi-hour research, raise Claude Code's tool timeout before launching the CLI and rely on incremental status polls:
export MCP_TOOL_TIMEOUT=14400000 # 4 hours
claude --mcp-config ./.mcp.json
Kick off work with deep_research or research_with_context, note the returned job ID, and call research_status to stream progress without letting any single tool call stagnate.
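The kick-off-then-poll pattern looks roughly like the loop below. This is a sketch under assumptions: check_status stands in for whatever issues the research_status call, and the "status" key with its "completed" / "failed" values is invented for illustration, since the actual response format is not documented here.

```python
import time
from typing import Callable


def wait_for_report(
    check_status: Callable[[str], dict],
    task_id: str,
    poll_interval: float = 30.0,
    timeout: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the task reports a terminal state or the deadline passes.

    check_status is a stand-in for a research_status call; the "status"
    key and its values are assumptions made for this sketch.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = check_status(task_id)  # each call is short-lived
        if result.get("status") in {"completed", "failed"}:
            return result
        if time.monotonic() + poll_interval > deadline:
            raise TimeoutError(f"task {task_id} still running after {timeout}s")
        sleep(poll_interval)
```

Because every status check returns quickly, no single tool call approaches the client timeout even when the research itself runs for hours.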
Output size: Claude Code limits MCP tool output (about 25,000 tokens by default, configurable via MAX_MCP_OUTPUT_TOKENS). The research tools declare the anthropic/maxResultSizeChars annotation so recent Claude Code versions accept large reports without extra configuration; on older versions, raise the limit before launching the CLI:
export MAX_MCP_OUTPUT_TOKENS=50000
claude
Timeout recovery: if a research task exceeds the configured timeout, the server leaves the provider task running by default and returns its task ID; call research_status with that ID to retrieve the full report once it completes. Pass --cancel-on-timeout to deep-research-mcp (or set cancel_on_timeout = true in ~/.deep_research) to restore the old cancel behavior.
- Use in Claude Code:
- The research tools will appear in Claude Code's tool palette
- Simply ask Claude to "research [your topic]" and it will use the Deep Research agent
- For clarified research, ask Claude to "research [topic] with clarification" to get follow-up questions
OpenAI Codex Integration
- Configure MCP Server
Choose one of the transports below.
Option A: stdio (recommended when Codex should spawn the server itself)
Add the MCP server configuration to your ~/.codex/config.toml file:
[mcp_servers.deep-research]
command = "uv"
args = ["run", "--directory", "/path/to/deep-research-mcp", "deep-research-mcp"]
# If your provider credentials live in shell env vars rather than ~/.deep_research,
# pass them through to the MCP subprocess explicitly:
env_vars = ["OPENAI_API_KEY"]
startup_timeout_ms = 30000 # 30 seconds for server startup
request_timeout_ms = 7200000 # 2 hours for long-running research tasks
# Alternatively, set tool_timeout_sec when using newer Codex clients
# tool_timeout_sec = 14400.0 # 4 hours for deep research runs
Replace /path/to/deep-research-mcp/ with the actual path to your cloned repository.
If your credentials are already configured in ~/.deep_research, env_vars is optional. It is required when you expect the spawned MCP server to inherit OPENAI_API_KEY from the parent shell.
Option B: HTTP (recommended when you want to run the server separately)
Start the server in one terminal:
OPENAI_API_KEY="$OPENAI_API_KEY" \
uv run --directory /path/to/deep-research-mcp \
deep-research-mcp --transport http --host 127.0.0.1 --port 8080
Then add this to ~/.codex/config.toml:
[mcp_servers.deep-research-http]
url = "http://127.0.0.1:8080/mcp"
tool_timeout_sec = 14400.0
The verified Streamable HTTP endpoint is http://127.0.0.1:8080/mcp.
Important timeout configuration:
- startup_timeout_ms: Time allowed for the MCP server to start (default: 30000ms / 30 seconds)
- request_timeout_ms: Maximum time for research queries to complete (recommended: 7200000ms / 2 hours for comprehensive research)
- tool_timeout_sec: Preferred for newer Codex clients; set this to a large value (e.g., 14400.0 for 4 hours) when you expect long-running research.
- Kick off research once to capture the job ID, then poll research_status so each tool call remains short and avoids hitting client timeouts.
Without proper timeout configuration, long-running research queries may fail with "request timed out" errors.
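The timeout settings mix milliseconds and seconds, which is easy to get wrong by a factor of 1000. The two hypothetical helpers below only do the unit arithmetic behind the values quoted in this section.

```python
def hours_to_ms(hours: float) -> int:
    """Hours to milliseconds, for *_timeout_ms style settings."""
    return int(hours * 60 * 60 * 1000)


def hours_to_seconds(hours: float) -> float:
    """Hours to seconds, for tool_timeout_sec."""
    return float(hours * 60 * 60)


print(hours_to_ms(2))       # request_timeout_ms for 2 hours: 7200000
print(hours_to_ms(4))       # MCP_TOOL_TIMEOUT for 4 hours: 14400000
print(hours_to_seconds(4))  # tool_timeout_sec for 4 hours: 14400.0
```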
- Use in OpenAI Codex:
- The research tools will be available automatically when you start Codex
- Ask Codex to "research [your topic]" and it will use the Deep Research MCP server
- For clarified research, ask for "research [topic] with clarification"
Gemini CLI Integration
- Configure MCP Server
Add the MCP server using Gemini CLI's built-in command:
gemini mcp add deep-research -- uv run --directory /path/to/deep-research-mcp deep-research-mcp
Or manually add to your ~/.gemini/settings.json file:
{
  "mcpServers": {
    "deep-research": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/deep-research-mcp", "deep-research-mcp"],
      "env": {
        "RESEARCH_PROVIDER": "gemini",
        "GEMINI_API_KEY": "$GEMINI_API_KEY"
      }
    }
  }
}
Replace /path/to/deep-research-mcp/ with the actual path to your cloned repository.
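When settings.json already contains other entries, the server block has to be merged in rather than pasted over the file. A minimal sketch of that merge follows; add_mcp_server is a hypothetical helper, and the "theme" and "other" entries in the example are placeholders rather than real Gemini CLI settings.

```python
import json


def add_mcp_server(settings: dict, repo_path: str) -> dict:
    """Return a copy of settings with the deep-research server entry added."""
    merged = dict(settings)
    servers = dict(merged.get("mcpServers", {}))  # keep existing servers
    servers["deep-research"] = {
        "command": "uv",
        "args": ["run", "--directory", repo_path, "deep-research-mcp"],
        "env": {
            "RESEARCH_PROVIDER": "gemini",
            "GEMINI_API_KEY": "$GEMINI_API_KEY",
        },
    }
    merged["mcpServers"] = servers
    return merged


# Placeholder content standing in for an existing settings file
existing = {"theme": "dark", "mcpServers": {"other": {"command": "other-server"}}}
print(json.dumps(add_mcp_server(existing, "/path/to/deep-research-mcp"), indent=2))
```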
- Use in Gemini CLI:
- Start Gemini CLI with gemini
- The research tools will be available automatically
- Ask Gemini to "research [your topic]" and it will use the Deep Research MCP server
- Use the /mcp command to view server status and available tools
HTTP transport: If your Gemini environment supports MCP-over-HTTP, you may run the server with --transport http and configure Gemini with the server URL.
Usage
As a Standalone Python Module
import asyncio

from deep_research_mcp.agent import DeepResearchAgent
from deep_research_mcp.config import ResearchConfig


async def main():
    # Initialize configuration
    config = ResearchConfig.load()

    # Create agent
    agent = DeepResearchAgent(config)

    # Perform research
    result = await agent.research(
        query="What are the latest advances in quantum computing?",
        system_prompt="Focus on practical applications and recent breakthroughs",
    )

    # Print results
    print(f"Report: {result.final_report}")
    print(f"Citations: {result.citations}")
    print(f"Research steps: {result.reasoning_steps}")
    print(f"Execution time: {result.execution_time:.2f}s")


# Run the research
asyncio.run(main())
As an MCP Server
Two transports are supported: stdio (default) and HTTP streaming.
# 1) stdio (default) — for editors/CLIs that spawn a local process
uv run deep-research-mcp
# 2) HTTP streaming — start a local HTTP MCP server
uv run deep-research-mcp --transport http --host 127.0.0.1 --port 8080
Notes:
- HTTP mode uses streaming responses provided by FastMCP. The tools in this server return their full results when a research task completes; streaming is still beneficial for compatible clients and for future incremental outputs.
- The verified Streamable HTTP endpoint is /mcp, so the default local URL is http://127.0.0.1:8080/mcp.
- If you start the server outside the client and rely on environment variables for credentials, export them before launching the server process. If you use stdio and let the client spawn the server, make sure the client passes the required env vars through.
Command-Line Interface
The unified CLI at cli/deep-research-cli.py provides direct access to all research functionality from the terminal. It supports two modes of operation: agent mode (default) which runs DeepResearchAgent directly, and MCP client mode which connects to a running MCP server over HTTP.
Configuration is loaded from ~/.deep_research by default. Every ResearchConfig parameter can be overridden via CLI flags, which take precedence over both the TOML file and environment variables.
Terminal UI
The repository also ships with a full-screen terminal UI at cli/deep-research-tui.py. It presents the same core functionality as the CLI in a dark, keyboard-driven interface for running clarification, deep research, task status checks, and saving output to disk.
The TUI features a split-panel layout:
- Left panel: Configuration controls for mode selection (Agent/MCP), provider settings, model configuration, query input, and system prompt
- Right panel: Output display showing research results, clarification questions, or status information
The animation above demonstrates the TUI workflow: selecting the Chat Completions API style, entering a research query about nuclear fusion, running the research, viewing results in the output panel, and saving the output to file.
Quick Start
# Start in direct agent mode
uv run python cli/deep-research-tui.py
# Start in MCP client mode
uv run python cli/deep-research-tui.py --mode mcp \
--server-url http://127.0.0.1:8080/mcp
# Start with Gemini selected
uv run python cli/deep-research-tui.py --provider gemini
TUI Workflow
- Use the left control panel to edit provider settings, query text, system prompt, and save path.
- The TUI starts focused on the Mode selector rather than inside the query editor.
- Use Tab/Shift+Tab to move through all controls.
- Use Up/Down to move between non-editor controls, including single-line inputs.
- Use Left/Right to toggle booleans and cycle through choice fields when a Switch or Select has focus.
- Press Enter to activate buttons, toggle switches, cycle selects, or move forward from a single-line input.
- TextArea widgets such as Query and System Prompt keep normal cursor-key editing behavior.
- Use c to run clarification, r to run deep research, t to check task status, s to save the current output, and q to quit.
- The right panel shows the latest clarification output, research report, or status response.
Provider Defaults
In agent mode, the TUI applies provider-aware defaults:
- openai + responses: model o4-mini-deep-research-2025-06-26, base URL https://api.openai.com/v1
- openai + chat_completions: model gpt-5-mini, base URL https://api.openai.com/v1
- dr-tulu: model dr-tulu, base URL http://localhost:8080/
- gemini: model deep-research-preview-04-2026, base URL https://generativelanguage.googleapis.com
- open-deep-research: model openai/qwen/qwen3-coder-30b, base URL http://localhost:1234/v1
Switching provider or OpenAI API style automatically refreshes the model and base URL defaults. You can still override those fields manually afterward.
Clarification, Research, and Saving Output
- In agent mode, Run Clarification calls the clarification flow directly through DeepResearchAgent.
- In mcp mode, Run Clarification calls the MCP deep_research tool with request_clarification=true.
- If clarification questions are returned, the TUI adds answer fields dynamically and uses those answers on the next research run.
- Run Deep Research executes either the direct agent flow or the MCP client flow, depending on the selected mode.
- Save Output writes the current contents of the output panel to the configured path, creating parent directories if needed.
Notes
- In mcp mode, set MCP Server URL to a running Streamable HTTP endpoint such as http://127.0.0.1:8080/mcp.
- The TUI reuses the same config-loading behavior as cli/deep-research-cli.py, so ~/.deep_research and any startup overrides still apply.
- Direct-agent JSON rendering is available through the JSON Output toggle; MCP mode saves the textual tool response exactly as returned by the server.
- The current layout is designed for terminals at least 100 columns wide and 28 rows tall.
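To check the minimum terminal size before launching the TUI, a wrapper script could run something like the sketch below. It uses the standard library's shutil.get_terminal_size; terminal_fits is a hypothetical helper, and the TUI itself is not documented as performing this check.

```python
import shutil

MIN_COLUMNS, MIN_ROWS = 100, 28  # layout minimum stated in the notes


def terminal_fits(columns: int, rows: int) -> bool:
    """True when the terminal meets the documented minimum layout size."""
    return columns >= MIN_COLUMNS and rows >= MIN_ROWS


size = shutil.get_terminal_size()
if not terminal_fits(size.columns, size.lines):
    print(
        f"Terminal is {size.columns}x{size.lines}; "
        f"the TUI layout expects at least {MIN_COLUMNS}x{MIN_ROWS}."
    )
```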
Quick Start
Gemini Deep Research:
uv run python cli/deep-research-cli.py \
--provider gemini \
--model deep-research-preview-04-2026 \
--base-url https://generativelanguage.googleapis.com \
research "What is the capital of France?"
Expected output (snippet):
============================================================
RESEARCH REPORT
============================================================
Task ID: v1_Chd5YUxjYVpXRUotYWxrZFVQOF9lcTRRVRIX...
Total steps: 1
Execution time: 245.94s
# Comprehensive Analysis of the French Capital: Demographic,
# Economic, and Historical Dimensions of Paris
**Key Points**
* **The capital of France is Paris**, acting as the undisputed
political, economic, and cultural epicenter of the nation.
* Data indicates that the Paris metropolitan area commands a
Gross Domestic Product (GDP) exceeding $1.03 trillion ...
OpenAI o4-mini-deep-research:
uv run python cli/deep-research-cli.py \
--provider openai \
--model o4-mini-deep-research-2025-06-26 \
--base-url https://api.openai.com/v1 \
research "What is the capital of France?"
Expected output:
============================================================
RESEARCH REPORT
============================================================
Task ID: resp_0e55abdae3d3ce390069dca3c7d78c819f91...
Total steps: 22
Search queries: 10
Citations: 1
Execution time: 34.34s
The capital of France is **Paris**
(www.britannica.com/place/Paris)
============================================================
CITATIONS
============================================================
1. Paris | Definition, Map, Population, Facts, & History
https://www.britannica.com/place/Paris
DR-Tulu (requires a running dr-tulu service; see Dr Tulu provider example):
uv run python cli/deep-research-cli.py \
--provider dr-tulu \
--base-url http://localhost:8080/ \
research "What is the capital of France?"
Expected output:
============================================================
RESEARCH REPORT
============================================================
Task ID: 7f3a1b2c-...
Total steps: 4
Citations: 2
Execution time: 18.72s
The capital of France is Paris. Paris has served as the
French capital since the late 10th century ...
============================================================
CITATIONS
============================================================
1. Source 1
https://en.wikipedia.org/wiki/Paris
2. Source 2
https://www.britannica.com/place/Paris
View resolved configuration or all available options:
uv run python cli/deep-research-cli.py config --pretty
uv run python cli/deep-research-cli.py --help
Commands
research QUERY -- perform deep research on a query.
# Simple research (agent mode)
uv run python cli/deep-research-cli.py research "Economic impact of AI adoption"
# Override provider and model for a single run
uv run python cli/deep-research-cli.py --provider gemini research "Climate change policies"
# Use a custom system prompt from a file
uv run python cli/deep-research-cli.py research "Healthcare trends" --system-prompt-file prompts/health.txt
# Or pass the system prompt inline
uv run python cli/deep-research-cli.py research "Healthcare trends" --system-prompt "Focus on peer-reviewed sources only"
# Output as JSON (includes metadata, citations, execution time)
uv run python cli/deep-research-cli.py research "AI safety" --json
# Save the report to a file
uv run python cli/deep-research-cli.py research "Renewable energy" --output-file report.md
# Disable code interpreter / data analysis tools
uv run python cli/deep-research-cli.py research "Simple topic" --no-analysis
# Notify a webhook when research completes
uv run python cli/deep-research-cli.py research "Long query" --callback-url https://example.com/webhook
Tested local OpenAI-compatible backends
The unified CLI works with local servers that expose an OpenAI-compatible Chat Completions API. The commands below were tested against local Ollama and local llama-server (from llama.cpp).
Ollama
Basic research flow:
uv run python cli/deep-research-cli.py \
--provider openai \
--api-style chat_completions \
--base-url http://localhost:11434/v1 \
--api-key test \
--model qwen3.5:0.8b \
--timeout 180 \
research "Reply with exactly: ok"
Observed output:
HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
============================================================
RESEARCH REPORT
============================================================
Task ID: chatcmpl-215
Total steps: 1
Execution time: 14.33s
ok
Interactive clarification with Ollama needs the clarification models pinned to the same local endpoint. In testing, qwen3.5:4b worked for clarification, while qwen3.5:0.8b was too small to reliably satisfy the structured triage step.
uv run python cli/deep-research-cli.py \
--provider openai \
--api-style chat_completions \
--base-url http://localhost:11434/v1 \
--api-key test \
--model qwen3.5:0.8b \
--clarification-base-url http://localhost:11434/v1 \
--clarification-api-key test \
--triage-model qwen3.5:4b \
--clarifier-model qwen3.5:4b \
--instruction-builder-model qwen3.5:4b \
--timeout 180 \
research "best laptop" --clarify
Observed interaction:
Starting clarification process...
Please answer the following clarifying questions:
1. What is your budget range?
Your answer (or press Enter to skip): Under $1500
2. What will you primarily use the laptop for (gaming, work, students, creative tasks)?
Your answer (or press Enter to skip): Programming and general work
3. Do you have a preferred operating system (macOS, Windows,)?
Your answer (or press Enter to skip): macOS preferred, Windows acceptable
...
Enriched query: What are the best laptops under $1500 for professional programming work, preferably macOS with 16GB RAM, 512GB SSD, 13-15 inch display, and good battery life?
llama-server
Start a local OpenAI-compatible server and let it download a small GGUF model from Hugging Face automatically:
llama-server \
--host 127.0.0.1 \
--port 8081 \
--api-key test \
--ctx-size 4096 \
--alias qwen2.5-0.5b \
-hf Qwen/Qwen2.5-0.5B-Instruct-GGUF:Q4_K_M
Then point the CLI at the server:
uv run python cli/deep-research-cli.py \
--provider openai \
--api-style chat_completions \
--base-url http://127.0.0.1:8081/v1 \
--api-key test \
--model qwen2.5-0.5b \
--timeout 120 \
research "Reply with exactly: ok"
Observed output:
HTTP Request: POST http://127.0.0.1:8081/v1/chat/completions "HTTP/1.1 200 OK"
============================================================
RESEARCH REPORT
============================================================
Task ID: chatcmpl-uzqSNDzRgYchZRxn1Kq2MNYyc2w6hpc6
Total steps: 1
Execution time: 0.15s
Ok
--clarify is model-sensitive on llama-server. Small tested models (qwen2.5-0.5b and qwen2.5-3b) completed the basic research command but did not reliably ask follow-up questions. Example output with qwen2.5-3b:
Starting clarification process...
Triage assessment: The query is focused on finding the best laptop, which is a common and specific research topic.
Assessment: The query 'best laptop' is clear and specific enough for direct research.
Proceeding with original query
research QUERY --clarify -- interactive clarification before research.
The --clarify flag runs an interactive clarification flow: the agent analyzes your query, asks follow-up questions to improve specificity, and then performs research using an enriched query. This works in both agent mode and MCP client mode.
In agent mode, --clarify automatically enables the clarification pipeline regardless of the enable_clarification setting in your config file.
# Interactive clarification (agent mode)
uv run python cli/deep-research-cli.py research "Quantum computing applications" --clarify
# Interactive clarification (MCP client mode)
uv run python cli/deep-research-cli.py research "Quantum computing" --clarify \
--server-url http://localhost:8080/mcp
Example session:
Starting clarification process...
Please answer the following clarifying questions:
1. Are you interested in near-term applications or long-term theoretical possibilities?
Your answer (or press Enter to skip): Near-term commercial applications
2. Which industries are you most interested in?
Your answer (or press Enter to skip): Finance and pharmaceuticals
3. Should the report focus on specific hardware platforms?
Your answer (or press Enter to skip):
Enriched query: Quantum computing applications in finance and pharmaceuticals...
Starting research with query: '...'
research QUERY --server-url URL -- use MCP client mode.
Instead of running the agent directly, connect to a running Deep Research MCP server over Streamable HTTP.
# First, start the MCP server in another terminal:
uv run deep-research-mcp --transport http --host 127.0.0.1 --port 8080
# Then run queries against it:
uv run python cli/deep-research-cli.py research "AI trends" \
--server-url http://127.0.0.1:8080/mcp
status TASK_ID -- check the status of a running research task.
# Agent mode
uv run python cli/deep-research-cli.py status abc123-def456-ghi789
# MCP client mode
uv run python cli/deep-research-cli.py status abc123-def456-ghi789 \
--server-url http://127.0.0.1:8080/mcp
config -- display the resolved configuration.
Shows the final configuration after merging the TOML file, environment variables, and any CLI overrides.
# JSON output (default), with secrets masked
uv run python cli/deep-research-cli.py config
# Human-readable output
uv run python cli/deep-research-cli.py config --pretty
# Show full API keys
uv run python cli/deep-research-cli.py config --pretty --show-secrets
# See the effect of CLI overrides
uv run python cli/deep-research-cli.py --provider gemini --timeout 600 config --pretty
# Skip config validation
uv run python cli/deep-research-cli.py config --no-validate
Configuration Overrides
All global flags are placed before the subcommand and override the corresponding ResearchConfig field:
| Flag | Description |
|------|-------------|
| --config PATH | Path to TOML config file (default: ~/.deep_research) |
| --provider {openai,dr-tulu,gemini,open-deep-research} | Research provider |
| --model MODEL | Model or agent ID |
| --api-key KEY | Provider API key |
| --base-url URL | Provider API base URL |
| --api-style {responses,chat_completions} | OpenAI API style |
| --timeout SECONDS | Max research timeout |
| --poll-interval SECONDS | Task poll interval |
| --log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL} | Logging level |
| --enable-clarification | Enable the clarification pipeline |
| --enable-reasoning-summaries | Enable reasoning summaries |
| --triage-model MODEL | Model for query triage |
| --clarifier-model MODEL | Model for query enrichment |
| --clarification-base-url URL | Base URL for clarification models |
| --clarification-api-key KEY | API key for clarification models |
| --instruction-builder-model MODEL | Model for instruction building |
Configuration precedence (highest to lowest): CLI flags > environment variables > TOML config file (~/.deep_research) > built-in defaults.
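The precedence rule can be pictured as a layered merge. The sketch below is illustrative only and is not the project's actual loader: `resolve_config`, the dictionary layers, and the sample values are hypothetical. Only the ordering (CLI flags, then environment variables, then the TOML file, then built-in defaults) comes from the text above.

```python
from collections import ChainMap


def resolve_config(cli, env, toml_file, defaults):
    """Merge config layers; earlier layers win (CLI > env > TOML > defaults)."""
    # Drop unset (None) values so they do not shadow lower-priority layers.
    layers = [
        {key: value for key, value in layer.items() if value is not None}
        for layer in (cli, env, toml_file, defaults)
    ]
    # ChainMap looks keys up in the first mapping that contains them.
    return dict(ChainMap(*layers))


# Hypothetical sample layers, for illustration only.
resolved = resolve_config(
    cli={"provider": "gemini", "timeout": None},
    env={"timeout": 600},
    toml_file={"provider": "openai", "poll_interval": 10},
    defaults={"provider": "openai", "timeout": 1800, "poll_interval": 30},
)
print(resolved)
```

Here the CLI layer supplies `provider`, the environment supplies `timeout` because the CLI left it unset, and the TOML file supplies `poll_interval`.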
Exit Codes
| Code | Meaning |
|------|---------|
| 0 | Success |
| 1 | Research or configuration error |
| 2 | MCP tool error |
| 3 | Unexpected error |
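Scripts that wrap the CLI can branch on these codes. The following is a minimal sketch, not part of the project: `describe_exit_code` and `run_research` are hypothetical helper names, and `run_research` assumes it is called from the repository root with `uv` on the PATH.

```python
import subprocess

# Meanings copied from the exit-code table above.
EXIT_CODES = {
    0: "Success",
    1: "Research or configuration error",
    2: "MCP tool error",
    3: "Unexpected error",
}


def describe_exit_code(code: int) -> str:
    """Translate a CLI exit code into its documented meaning."""
    return EXIT_CODES.get(code, f"Undocumented exit code {code}")


def run_research(query: str) -> tuple[int, str]:
    """Run the research subcommand and return (exit code, meaning).

    Hypothetical helper: assumes the current directory is the repository root.
    """
    proc = subprocess.run(
        ["uv", "run", "python", "cli/deep-research-cli.py", "research", query],
        check=False,  # inspect the return code ourselves instead of raising
    )
    return proc.returncode, describe_exit_code(proc.returncode)
```

A caller might retry on code 2 (for example, when the MCP server is temporarily unreachable) but fail fast on code 1, since a configuration error will not resolve itself.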
Example Queries
# Basic research query
result = await agent.research("Explain the transformer architecture in AI")
# Research with code analysis
result = await agent.research(
    query="Analyze global temperature trends over the last 50 years",
    include_code_interpreter=True
)
# Custom system instructions
result = await agent.research(
    query="Review the safety considerations for AGI development",
    system_prompt="""
    Provide a balanced analysis including:
    - Technical challenges
    - Current safety research
    - Regulatory approaches
    - Industry perspectives
    Include specific examples and data where available.
    """
)
# With clarification (requires ENABLE_CLARIFICATION=true)
clarification_result = agent.start_clarification("quantum computing applications")
if clarification_result.get("needs_clarification"):
    # Answer questions programmatically or present them to the user
    answers = ["Hardware applications", "Last 5 years", "Commercial products"]
    agent.add_clarification_answers(clarification_result["session_id"], answers)
    enriched_query = agent.get_enriched_query(clarification_result["session_id"])
    result = await agent.research(enriched_query)
Clarification Features
The agent includes an optional clarification system to improve research quality through follow-up questions.
Configuration
Enable clarification in your ~/.deep_research file:
```toml
[clarification]
enable_clarification = true
triage_model = "gpt-5-mini"  # Optional, defaults to gpt-5-mini
clarifier_model = "gpt-5-mini"  # Optional, defaults to gpt-5-mini
instruction_builder_model = "gpt-5-mini"  # Optional, defaults to gpt-5-mini
clarification_api_key = "YOUR_CLARIFICATION_API_KEY"  # Optional custom API key for clarification models
clarification_base_url = "https://custom-api.example.com/v1"  # Optional custom endpoint for clarification models
```
Clarification and instruction-building remain OpenAI-compatible chat flows. If your main research provider is dr-tulu, gemini, or open-deep-research, set clarification_api_key / clarification_base_url explicitly, or provide OPENAI_API_KEY / OPENAI_BASE_URL in the environment for those helper models.
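The fallback order for the helper-model credentials can be sketched as follows. This is an illustration of the behaviour described in this README, not the project's code: `resolve_clarification_api_key` and its parameters are hypothetical names, and the same pattern would apply to the base URL with `OPENAI_BASE_URL`.

```python
import os


def resolve_clarification_api_key(provider, main_api_key=None,
                                  clarification_api_key=None, environ=None):
    """Pick the API key used by the clarification helper models.

    Order sketched here: an explicit clarification key, then the main key
    when the provider is openai, then OPENAI_API_KEY from the environment.
    """
    environ = os.environ if environ is None else environ
    if clarification_api_key:
        return clarification_api_key
    if provider == "openai" and main_api_key:
        return main_api_key
    # Non-OpenAI providers cannot reuse their own key for the helper models.
    return environ.get("OPENAI_API_KEY")  # None when nothing is configured


# A Gemini key is never reused; the helper models need an OpenAI-compatible key.
print(resolve_clarification_api_key("gemini", main_api_key="gemini-key",
                                    environ={"OPENAI_API_KEY": "sk-env"}))
```

A `None` result corresponds to the case the paragraph above warns about: a non-OpenAI provider with no clarification credentials set anywhere.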
Usage Flow
- Start Clarification:
result = agent.start_clarification("your research query")
- Check if Questions are Needed:
if result.get("needs_clarification"):
    questions = result["questions"]
    session_id = result["session_id"]
- Provide Answers:
answers = ["answer1", "answer2", "answer3"]
agent.add_clarification_answers(session_id, answers)
- Get Enriched Query:
enriched_query = agent.get_enriched_query(session_id)
final_result = await agent.research(enriched_query)
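The four steps can be combined into a single helper. The sketch below runs offline against a stand-in: `StubAgent`, `research_with_clarification`, and `answer_fn` are hypothetical, and the stub's return values are invented for illustration. Only the method names and the `needs_clarification`, `questions`, and `session_id` keys come from the steps above.

```python
import asyncio


class StubAgent:
    """Stand-in for DeepResearchAgent so the sketch runs offline (hypothetical)."""

    def __init__(self):
        self._sessions = {}

    def start_clarification(self, query):
        self._sessions["s1"] = {"query": query, "answers": []}
        return {
            "needs_clarification": True,
            "questions": ["Which industries?", "What time frame?"],
            "session_id": "s1",
        }

    def add_clarification_answers(self, session_id, answers):
        self._sessions[session_id]["answers"] = list(answers)
        return {"session_id": session_id, "answered": len(answers)}

    def get_enriched_query(self, session_id):
        session = self._sessions[session_id]
        details = "; ".join(a for a in session["answers"] if a)
        return f"{session['query']} ({details})" if details else session["query"]

    async def research(self, query):
        return {"query": query, "report": "stub report"}


async def research_with_clarification(agent, query, answer_fn):
    """Run the four steps in order; answer_fn maps a question to an answer."""
    result = agent.start_clarification(query)
    if result.get("needs_clarification"):
        session_id = result["session_id"]
        answers = [answer_fn(question) for question in result["questions"]]
        agent.add_clarification_answers(session_id, answers)
        query = agent.get_enriched_query(session_id)
    # Falls through with the original query when no clarification is needed.
    return await agent.research(query)


outcome = asyncio.run(
    research_with_clarification(StubAgent(), "quantum computing", lambda q: "finance")
)
print(outcome["query"])
```

Passing `answer_fn` as a callable keeps the helper independent of where answers come from: `input` for an interactive session, or a lookup table in automated runs.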
Integration with AI Assistants
When working with AI assistants through the MCP tools:
- Request Clarification: Use deep_research() with request_clarification=True
- Answer Questions: The AI assistant will present the questions to you
- Deep Research: The AI assistant will automatically call research_with_context() with your answers
API Reference
DeepResearchAgent
The main class for performing research operations.
Methods
- research(query, system_prompt=None, include_code_interpreter=True, callback_url=None) - Performs deep research on a query
  - callback_url: optional webhook notified when research completes
  - Returns: Dictionary with final report, citations, and metadata
- get_task_status(task_id) - Check the status of a research task
  - Returns: Task status information
- start_clarification(query) - Analyze query and generate clarifying questions if needed
  - Returns: Dictionary with questions and session ID
- add_clarification_answers(session_id, answers) - Add answers to clarification questions
  - Returns: Session status information
- get_enriched_query(session_id) - Generate enriched query from clarification session
  - Returns: Enhanced query string
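For local experiments with callback_url, a small receiver built on the standard library may be enough. This is a sketch under stated assumptions: this README does not document the notification format, so the code assumes an HTTP POST carrying a JSON object. `parse_notification`, `CallbackHandler`, and `start_callback_server` are hypothetical names.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # notifications collected by the handler


def parse_notification(body: bytes) -> dict:
    """Decode a JSON notification body; an empty body becomes an empty dict."""
    payload = json.loads(body) if body else {}
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    return payload


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Assumption: the notification arrives as a POST with a JSON body.
        length = int(self.headers.get("Content-Length", 0))
        try:
            received.append(parse_notification(self.rfile.read(length)))
        except ValueError:  # also covers json.JSONDecodeError
            self.send_response(400)
        else:
            self.send_response(204)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # silence per-request logging


def start_callback_server(host="127.0.0.1", port=0):
    """Serve in a background thread; port=0 lets the OS pick a free port."""
    server = HTTPServer((host, port), CallbackHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

After `server = start_callback_server()`, the chosen port is available as `server.server_address[1]`, and a URL such as `http://127.0.0.1:<port>/` could be passed as callback_url. Call `server.shutdown()` when finished.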
ResearchConfig
Configuration class for the research agent.
Parameters
- provider: Research provider (openai, dr-tulu, gemini, or open-deep-research; default: openai)
- api_style: API style for the openai provider (responses or chat_completions; default: responses). Ignored for dr-tulu, gemini, and open-deep-research.
- model: Model identifier
  - OpenAI: Responses model (e.g., gpt-5-mini)
  - DR-Tulu: logical provider id (default: dr-tulu)
  - Gemini: Deep Research agent id (for example deep-research-preview-04-2026)
  - Open Deep Research: LiteLLM model id (e.g., openai/qwen/qwen3-coder-30b)
- api_key: API key for the configured endpoint (optional). Defaults to env OPENAI_API_KEY for openai, DR_TULU_API_KEY for dr-tulu, GEMINI_API_KEY / GOOGLE_API_KEY for gemini.
- base_url: Provider API base URL (optional). Defaults to https://api.openai.com/v1 for openai, http://localhost:8080/ for dr-tulu, https://generativelanguage.googleapis.com for gemini, and http://localhost:1234/v1 for open-deep-research.
- timeout: Maximum time for research in seconds (default: 1800)
- poll_interval: Polling interval in seconds (default: 30)
- enable_clarification: Enable clarifying questions (default: False)
- triage_model: Model for query analysis (default: gpt-5-mini)
- clarifier_model: Model for query enrichment (default: gpt-5-mini)
- clarification_api_key: Custom API key for clarification models (optional; defaults to the main OpenAI credentials when provider=openai, otherwise falls back to env OPENAI_API_KEY if present)
- clarification_base_url: Custom OpenAI-compatible endpoint for clarification models (optional; defaults to the main OpenAI endpoint when provider=openai, otherwise falls back to env OPENAI_BASE_URL if present)
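The interaction between timeout and poll_interval can be illustrated with a generic polling loop. This is a sketch, not the agent's internal implementation: `wait_for_task` is a hypothetical helper, `get_status` is any synchronous callable you supply, and the `"status"` key with the terminal values `"completed"` and `"failed"` is an assumption about the status payload.

```python
import time


def wait_for_task(get_status, task_id, timeout=1800, poll_interval=30,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll get_status(task_id) until a terminal state or until timeout.

    Assumption: the status is a dict whose "status" key ends up as
    "completed" or "failed"; the real field names may differ.
    """
    deadline = clock() + timeout
    while True:
        status = get_status(task_id)
        if status.get("status") in {"completed", "failed"}:
            return status
        # Give up if the next poll would land after the deadline.
        if clock() + poll_interval > deadline:
            raise TimeoutError(f"task {task_id} still running after {timeout}s")
        sleep(poll_interval)
```

With the defaults above (1800 and 30), a task is polled at most about 60 times before the loop gives up. The `sleep` and `clock` parameters exist only so the loop can be exercised in tests without real waiting.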
Development
Running Tests
# Install dev dependencies
uv sync --extra dev
# Run all tests
uv run pytest -v
# Run with coverage
uv run pytest --cov=deep_research_mcp tests/
# Run specific test file
uv run pytest tests/test_agents.py
Lint, Format, Type Check
uv run black .
uv run pylint src/deep_research_mcp tests
uv run mypy src/deep_research_mcp





