claude-code-codex-agents
Give Claude Code structured Codex traces, not raw output.
For Claude Code users who want GPT-5.4 as a real tool: claude-code-codex-agents parses the entire JSONL event stream from Codex CLI and returns a structured execution report -- which tools it used, which files it touched, how long it took, and what went wrong. No other Codex MCP bridge does this.
graph LR
A["Claude Code<br/>(Opus 4.6)"] -->|MCP Protocol| B["claude-code-codex-agents<br/>MCP Server"]
B -->|"subprocess + stdin"| C[Codex CLI]
C -->|JSONL stream| B
C -->|API call| D["OpenAI API<br/>(GPT-5.4)"]
B -->|Structured Report| A
Without vs With claude-code-codex-agents
Without -- You call Codex CLI and get a wall of text. You don't know what tools it used, what files it changed, or if it actually succeeded.
With claude-code-codex-agents -- Claude Code gets a structured execution trace:
[Codex gpt-5.4] Completed
⏱ Execution time: 8.3s
🧵 Thread: 019d436e-4c39-7093-b7ed-f8a26aca7938
📦 Tools used (3):
✅ read_file — src/auth.py
✅ edit_file — src/auth.py
✅ shell — python -m pytest tests/
📁 Files touched (1):
• src/auth.py
━━━ Codex Response ━━━
Fixed the authentication logic. Token validation order was incorrect.
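A report like the one above is derived from line-delimited JSON events. The following is a minimal sketch of that kind of parsing, not the project's actual parser: the event shapes (`thread_started`, `tool_call`, `error`) and the `TraceSketch` class are hypothetical stand-ins, and the real Codex CLI event schema may differ.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TraceSketch:
    """Hypothetical container; the project's real trace type may differ."""
    thread_id: Optional[str] = None
    tools: list = field(default_factory=list)   # (name, detail) pairs
    files: list = field(default_factory=list)   # unique paths, in first-seen order
    errors: list = field(default_factory=list)


def parse_events(stream: str) -> TraceSketch:
    """Fold a JSONL stream into a trace, tolerating malformed lines."""
    trace = TraceSketch()
    for raw in stream.splitlines():
        raw = raw.strip()
        if not raw:
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            trace.errors.append(f"unparseable line: {raw[:40]}")
            continue
        if not isinstance(event, dict):
            continue
        kind = event.get("type")  # assumed field name
        if kind == "thread_started":
            trace.thread_id = event.get("thread_id")
        elif kind == "tool_call":
            path = event.get("path")
            trace.tools.append((event.get("name", "?"), path or event.get("command", "")))
            if path and path not in trace.files:
                trace.files.append(path)
        elif kind == "error":
            trace.errors.append(event.get("message", "unknown error"))
    return trace


SAMPLE = "\n".join([
    '{"type": "thread_started", "thread_id": "t-1"}',
    '{"type": "tool_call", "name": "read_file", "path": "src/auth.py"}',
    '{"type": "tool_call", "name": "edit_file", "path": "src/auth.py"}',
    '{"type": "tool_call", "name": "shell", "command": "python -m pytest tests/"}',
    "not json",
])
trace = parse_events(SAMPLE)
```

Recording malformed lines as errors instead of raising keeps one bad event from discarding the rest of the trace.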
Why claude-code-codex-agents?
There are 6+ Codex MCP bridges on GitHub. Here's what makes this one different:
| | Other bridges | claude-code-codex-agents |
|---|---|---|
| Output | Raw text dump | Structured trace (tools, files, timing, errors) |
| Parallel tasks | 1 at a time | Up to 6 simultaneous |
| Session continuity | Stateless | threadId persistence across calls |
| Security | Pass-through | 3-tier sandbox + terminal injection prevention |
| Tests | Few or none | 59 tests (parsing, security, sessions, edge cases, agent lifecycle) |
| Review | Basic or none | Adversarial Review Loop (GPT-5.4 challenges Claude's code) |
Key Features
- Full JSONL Trace Parsing -- Every Codex event (tool calls, file ops, errors) parsed into a structured report
- Parallel Execution -- Run up to 6 Codex tasks simultaneously via `parallel_execute`
- Session Management -- Continue previous threads with `session_continue` (threadId persistence)
- Agent Lifecycle -- Run Codex as a background Claude Code-style worker via `spawn_codex_agent`, `send_codex_agent_input`, and `wait_codex_agent`
- Adversarial Review Loop -- GPT-5.4 reviews Claude's code from a different perspective
- Sandbox Security -- 3-tier policy (read-only / workspace-write / danger-full-access) + terminal injection prevention
- Cross-Model Discussion -- Get GPT-5.4's opinion on design decisions via `discuss`
- Zero External Dependencies -- Just FastMCP + Codex CLI. No databases, no Docker, no config files
- Japanese Native -- Full Japanese prompt and report support
- 59 Tests -- Comprehensive coverage including security, parsing, session management, agent lifecycle, and edge cases
Quick Start
1. Install Codex CLI
npm install -g @openai/codex
codex login
2. Install claude-code-codex-agents
git clone https://github.com/tsunamayo7/claude-code-codex-agents.git
cd claude-code-codex-agents
uv sync
3. Add to your MCP client
Claude Code (~/.claude/settings.json):
{
"mcpServers": {
"claude-code-codex-agents": {
"type": "stdio",
"command": "uv",
"args": ["run", "--directory", "/path/to/claude-code-codex-agents", "python", "server.py"],
"env": { "PYTHONUTF8": "1" }
}
}
}
<details> <summary><b>Cursor</b> (~/.cursor/mcp.json)</summary>
{
"mcpServers": {
"claude-code-codex-agents": {
"command": "uv",
"args": ["run", "--directory", "/path/to/claude-code-codex-agents", "python", "server.py"],
"env": { "PYTHONUTF8": "1" }
}
}
}
</details>
<details> <summary><b>VS Code / Windsurf</b></summary>
Add to your MCP settings:
{
"claude-code-codex-agents": {
"command": "uv",
"args": ["run", "--directory", "/path/to/claude-code-codex-agents", "python", "server.py"],
"env": { "PYTHONUTF8": "1" }
}
}
</details>
Tools
| Tool | Description | Sandbox |
|------|-------------|---------|
| execute | Delegate tasks to Codex with structured trace report | workspace-write |
| trace_execute | Same as execute, plus full event timeline | workspace-write |
| parallel_execute | Run up to 6 tasks simultaneously | read-only |
| review | Adversarial code review by GPT-5.4 | read-only |
| explain | Code explanation (brief/medium/detailed) | read-only |
| generate | Code generation with optional file output | workspace-write |
| discuss | Get GPT-5.4's perspective on design decisions | read-only |
| session_continue | Continue a previous Codex thread | workspace-write |
| session_list | List session history with thread IDs | - |
| spawn_codex_agent | Launch a background Codex worker with default / explorer / worker roles | role-based |
| send_codex_agent_input | Continue a background Codex worker with follow-up instructions | same as agent |
| wait_codex_agent | Wait for an agent turn and fetch the last structured result | - |
| list_codex_agents | Inspect tracked background Codex agents | - |
| close_codex_agent | Close an idle Codex agent | - |
| status | Check Codex CLI status and auth | - |
Claude Code-Style Agents
The new agent lifecycle tools let Claude Code treat Codex more like a persistent sub-agent than a one-shot CLI call.
- Use `spawn_codex_agent` to start a background worker with a role preset: default for balanced execution, explorer for read-heavy investigation, worker for implementation.
- Use `send_codex_agent_input` to continue the same worker after you read its last result.
- Use `wait_codex_agent` to poll for completion without blocking other work.
- Use `list_codex_agents` and `close_codex_agent` to manage idle workers.
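The lifecycle described above (spawn, send input, poll, close when idle) can be pictured as a small state machine. The sketch below is an illustrative in-memory model only: `RegistrySketch`, `AgentSketch`, and every method name are hypothetical and do not reflect the server's real signatures, and `finish_turn` stands in for a background Codex turn completing.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional

ROLES = ("default", "explorer", "worker")  # role presets named in this README


@dataclass
class AgentSketch:
    agent_id: str
    role: str
    state: str = "idle"                  # "idle" or "running"
    pending_input: Optional[str] = None
    last_result: Optional[str] = None


class RegistrySketch:
    """Hypothetical tracker for background agents; not the project's code."""

    def __init__(self) -> None:
        self._agents = {}
        self._ids = count(1)

    def spawn(self, role: str = "default") -> str:
        if role not in ROLES:
            raise ValueError(f"unknown role: {role}")
        agent = AgentSketch(f"agent-{next(self._ids)}", role)
        self._agents[agent.agent_id] = agent
        return agent.agent_id

    def send_input(self, agent_id: str, text: str) -> None:
        agent = self._agents[agent_id]
        if agent.state != "idle":
            raise RuntimeError("agent is still running a turn")
        # A real server would start a Codex turn here.
        agent.state, agent.pending_input = "running", text

    def finish_turn(self, agent_id: str, result: str) -> None:
        # Stand-in for the background turn completing.
        agent = self._agents[agent_id]
        agent.state, agent.pending_input, agent.last_result = "idle", None, result

    def poll(self, agent_id: str):
        # Non-blocking: report the current state and the last finished result.
        agent = self._agents[agent_id]
        return agent.state, agent.last_result

    def close(self, agent_id: str) -> None:
        if self._agents[agent_id].state != "idle":
            raise RuntimeError("only idle agents can be closed")
        del self._agents[agent_id]

    def list_ids(self) -> list:
        return sorted(self._agents)


registry = RegistrySketch()
agent_id = registry.spawn("explorer")
registry.send_input(agent_id, "map the auth module")
state_during, _ = registry.poll(agent_id)
registry.finish_turn(agent_id, "report: 3 files inspected")
state_after, result = registry.poll(agent_id)
registry.close(agent_id)
```

Restricting `close` to idle agents mirrors the tool table, which describes `close_codex_agent` as closing an idle agent.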
Real-World Example: Adversarial Code Review
Claude Code writes code, then asks GPT-5.4 to review it:
[Codex Review] GPT-5.4 Review Result
⏱ Execution time: 15.7s
━━━ Codex Response ━━━
- [CRITICAL] `run(cmd)` calls `os.system(cmd)` directly -- command injection
if `cmd` contains user input. Use `subprocess.run([...], shell=False)`.
- [WARNING] `divide(a, b)` raises ZeroDivisionError when b == 0.
Add a pre-check or explicit error message.
- [INFO] No type hints on function signatures. Add `def divide(a: float,
b: float) -> float:` for readability.
Real-World Example: Parallel Execution
Analyze multiple tasks simultaneously:
[Parallel Execution Complete] 3 tasks
━━━ Task 1 ✅ ━━━
Instruction: Analyze src/auth.py for security issues
⏱ 5.2s
...
━━━ Task 2 ✅ ━━━
Instruction: Review database query patterns in src/db.py
⏱ 7.8s
...
━━━ Task 3 ✅ ━━━
Instruction: Check error handling in src/api.py
⏱ 4.1s
...
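One common way to run several tasks at once while capping concurrency is an `asyncio.Semaphore`. The sketch below assumes that approach; whether the server caps concurrency this way or simply rejects more than 6 tasks is not stated here, and `run_task` is a placeholder rather than a real Codex invocation.

```python
import asyncio

MAX_PARALLEL = 6  # the limit quoted in this README


async def run_task(instruction: str) -> str:
    # Placeholder for launching Codex and parsing its report.
    await asyncio.sleep(0.01)
    return f"done: {instruction}"


async def run_parallel(instructions, limit: int = MAX_PARALLEL) -> list:
    semaphore = asyncio.Semaphore(limit)

    async def guarded(instruction: str) -> str:
        # At most `limit` tasks hold the semaphore at any moment.
        async with semaphore:
            return await run_task(instruction)

    # gather() returns results in the order the tasks were submitted.
    return await asyncio.gather(*(guarded(i) for i in instructions))


if __name__ == "__main__":
    reports = asyncio.run(run_parallel([
        "Analyze src/auth.py for security issues",
        "Review database query patterns in src/db.py",
        "Check error handling in src/api.py",
    ]))
    for report in reports:
        print(report)
```

Because `gather` preserves submission order, task numbers in a combined report can be matched to instructions without extra bookkeeping.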
Architecture
sequenceDiagram
participant C as Claude Code
participant H as claude-code-codex-agents
participant X as Codex CLI
participant O as OpenAI API
C->>H: MCP tool call (execute)
H->>H: _validate() + _enforce_sandbox()
H->>X: subprocess (stdin prompt)
X->>O: API request (GPT-5.4)
O-->>X: Response
X-->>H: JSONL event stream
H->>H: parse_jsonl_events() → CodexTrace
H->>H: _sanitize() → format_report()
H-->>C: Structured report
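The subprocess step in the diagram can be sketched with the standard library. This is an assumption-laden illustration, not the server's code: `run_with_stdin` is a hypothetical helper, and a small Python child process stands in for Codex CLI so that the example runs without it installed.

```python
import json
import subprocess
import sys
from typing import Optional


def run_with_stdin(argv: list, prompt: str, timeout_s: float) -> Optional[list]:
    """Send the prompt on stdin and parse stdout as JSONL; None on timeout."""
    try:
        proc = subprocess.run(
            argv,
            input=prompt,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising TimeoutExpired.
        return None
    return [json.loads(line) for line in proc.stdout.splitlines() if line.strip()]


# Stand-in child: echoes its stdin back as a single JSON line.
ECHO = [sys.executable, "-c",
        "import sys, json; print(json.dumps({'prompt': sys.stdin.read()}))"]
events = run_with_stdin(ECHO, "fix the bug", timeout_s=10)

# Stand-in for a hung process, used to exercise the timeout path.
SLOW = [sys.executable, "-c", "import time; time.sleep(30)"]
timed_out = run_with_stdin(SLOW, "", timeout_s=0.5)
```

Passing the prompt on stdin rather than as an argument keeps it out of the process list and avoids shell quoting issues.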
Security Model
| Sandbox Mode | File Write | Shell Exec | Use Case |
|---|---|---|---|
| read-only | Blocked | Blocked | Review, explain, discuss |
| workspace-write | CWD only | Allowed | Execute, generate |
| danger-full-access | Anywhere | Allowed | Full system access (use with caution) |
Additional protections:
- ANSI/OSC escape sequence sanitization (terminal injection prevention)
- Input validation on all parameters
- Process kill on timeout
- `--ephemeral` flag (no persistent Codex state)
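Escape-sequence sanitization of the kind listed above can be approximated with regular expressions. The patterns below are a minimal sketch and an assumption about what such a filter might cover (OSC, CSI, other two-byte escapes, and stray control characters); `sanitize` is a hypothetical name and the project's own filter may be stricter or looser.

```python
import re

# OSC is listed first so that ESC ] is consumed as a whole sequence.
_ESCAPES = re.compile(
    r"\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"  # OSC, terminated by BEL or ESC \
    r"|\x1b\[[0-?]*[ -/]*[@-~]"           # CSI (colors, cursor movement)
    r"|\x1b[@-Z\\-_]"                     # remaining two-byte escapes
)
# Control characters other than tab and newline.
_CONTROL = re.compile(r"[\x00-\x08\x0b-\x1f\x7f]")


def sanitize(text: str) -> str:
    """Strip terminal escape sequences and stray control characters."""
    return _CONTROL.sub("", _ESCAPES.sub("", text))


dirty = "\x1b]0;spoofed title\x07\x1b[31mFixed auth\x1b[0m\n"
clean = sanitize(dirty)  # "Fixed auth\n"
```

Removing whole sequences before the leftover control characters avoids leaving fragments such as `[31m` in the report.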
Development
# Setup
git clone https://github.com/tsunamayo7/claude-code-codex-agents.git
cd claude-code-codex-agents
uv sync --extra dev
# Run tests (59 tests)
uv run pytest tests/ -v
# Run server directly
uv run python server.py
Project structure: a single file (server.py, ~820 lines), intended to be easy to read, modify, and contribute to.
Use Cases
- Cross-Model Code Review -- Claude writes code and GPT-5.4 reviews it, which reduces single-model bias.
- Parallel Codebase Analysis -- Analyze 6 files simultaneously, get structured reports for each.
- Design Discussion -- Get GPT-5.4's alternative perspective on architectural decisions via `discuss`.
- Session-Based Refactoring -- Large refactoring across multiple `session_continue` calls with context preservation.
- AI Second Opinion -- When Claude's answer seems off, ask GPT-5.4 for a sanity check.
Requirements
- Python 3.12+
- Codex CLI (`npm install -g @openai/codex`)
- OpenAI account (Codex CLI must be authenticated via `codex login`)
- uv (recommended) or pip
Related Projects
Helix Ecosystem
- helix-ai-studio — All-in-one AI chat studio with 7 providers, RAG, MCP tools, and pipeline
- helix-pilot — GUI automation MCP server — AI controls Windows desktop via local Vision LLM
- helix-agent — Extend Claude Code with local Ollama models — cut token costs by 60-80%
- helix-sandbox — Secure sandbox MCP server — Docker + Windows Sandbox
Alternative Codex Bridges
- codex-plugin-cc -- Official OpenAI plugin for Claude Code
- codex-mcp-server -- Alternative Codex MCP bridge (Node.js)






