Remote OpenClaw Blog
Best Free Models for OpenAI Codex — Plug OpenRouter Into config.toml
The best free model for OpenAI Codex is openai/gpt-oss-120b:free on OpenRouter, because it is OpenAI's own open-weight model and follows the tool-calling conventions Codex's harness expects. The honest framing matters here: the Codex CLI defaults to OpenAI's paid models (the gpt-5.x-codex family), which are not free. But Codex supports custom, OpenAI-compatible providers through its config.toml, so you can point it at OpenRouter's free model tier and run it at zero model cost.
As of July 2026, OpenRouter lists 23 free models (IDs ending in :free), limited to 20 requests per minute and 50 requests per day, rising to 1,000 requests per day after a one-time $10 credit purchase. This guide ranks the free models that pair best with Codex and gives you the exact config block to wire one up.
The Honest Situation: Codex and Free Models
The OpenAI Codex CLI is built around OpenAI's paid reasoning models and ships pointed at the gpt-5.x-codex family, which bill per token or come with a paid ChatGPT plan. There is no free model baked into Codex itself. If you want zero model cost, you have to bring your own provider.
Codex makes that possible through a [model_providers.<id>] block in its configuration. According to the Codex configuration reference, you can define any OpenAI-compatible endpoint with a base_url, an env_key for the API key, and a wire_api protocol. OpenRouter is OpenAI-compatible, so it slots straight in. OpenRouter documents the exact setup in its Codex CLI integration guide.
Two constraints shape which free models are worth using. First, since February 2026 the only accepted wire_api value is "responses" — Codex removed the older chat protocol. Second, Codex's agent harness (its apply_patch tool and reasoning loop) was tuned around OpenAI models, so free models that share OpenAI's tool conventions behave more reliably than those that do not. That is why the ranking below leads with OpenAI's own open-weight release rather than a raw benchmark winner.
How to Configure Codex With OpenRouter Free Models
You configure a free model by adding an OpenRouter provider to your user-level Codex config and selecting a :free model slug. The provider and model settings only take effect in ~/.codex/config.toml — Codex ignores them in a project-local config and prints a startup warning.
```toml
# ~/.codex/config.toml
model = "openai/gpt-oss-120b:free"
model_provider = "openrouter"

[model_providers.openrouter]
name = "OpenRouter"
base_url = "https://openrouter.ai/api/v1"
env_key = "OPENROUTER_API_KEY"
wire_api = "responses"
```

```shell
# Provide your key via the env var named in env_key
export OPENROUTER_API_KEY="your-openrouter-key"

# Start Codex on the free model
codex
```
The model value has to be the exact OpenRouter slug, including the provider prefix and the :free suffix — bare model names are rejected. To switch free models, change the model line to any ID from the ranking below. For a deeper look at extending Codex with tools, see our Codex CLI MCP guide.
Best Free Models for OpenAI Codex, Ranked
These free OpenRouter models pair best with Codex's OpenAI-tuned harness, ranked for reliability inside the agent loop rather than raw benchmark scores. All IDs are verbatim from the live OpenRouter models list as of July 2026; the free roster rotates, so verify before committing.
1. GPT-OSS 120B — openai/gpt-oss-120b:free
Maker: OpenAI (open-weight, Apache-2.0). Context: 131,072 tokens. Because it comes from OpenAI, GPT-OSS 120B aligns most naturally with Codex's tool format and reasoning-effort settings, which means fewer malformed apply_patch calls than non-OpenAI models. It is the safest free default for Codex. Limitation: its 131K window is modest for very large repositories.
2. Qwen3-Coder — qwen/qwen3-coder:free
Maker: Qwen (Alibaba). Context: 1,048,576 tokens. Qwen3-Coder is the strongest free coding model on OpenRouter and the pick when your task needs to hold a whole repository in context. It generates high-quality diffs, though its tool-call formatting matches Codex's harness less closely than GPT-OSS does, so expect occasional retries. Use it for large-context coding work.
3. GPT-OSS 20B — openai/gpt-oss-20b:free
Maker: OpenAI (open-weight). Context: 131,072 tokens. The lighter sibling of GPT-OSS 120B, it keeps the same OpenAI tool conventions while responding faster, which suits quick edits and iterative loops. Limitation: it is noticeably weaker on multi-step reasoning, so it is a background model rather than a primary one.
4. Nemotron 3 Super 120B — nvidia/nemotron-3-super-120b-a12b:free
Maker: NVIDIA. Context: 1,000,000 tokens. A mixture-of-experts (MoE) model with a million-token context window and about 12B active parameters, Nemotron 3 Super is a strong long-context fallback when Qwen3-Coder is rate-limited. Limitation: as a generalist it trails coding specialists on tight code generation and can be verbose in tool responses.
5. Llama 3.3 70B Instruct — meta-llama/llama-3.3-70b-instruct:free
Maker: Meta. Context: 131,072 tokens. The most broadly supported free open model, Llama 3.3 70B behaves predictably across tools and is a dependable choice for explanation and refactoring. Limitation: it is older than the top entries and its tool calling is less crisp than GPT-OSS inside Codex.
Comparison Table
Every model below is free on OpenRouter and shares the same tier limits: 20 requests per minute and 50 requests per day, rising to 1,000 per day after a one-time $10 credit purchase, per OpenRouter's rate-limit documentation.
| Model ID (:free) | Maker | Context | Best For in Codex |
|---|---|---|---|
| openai/gpt-oss-120b:free | OpenAI | 131,072 | Primary — best tool alignment |
| qwen/qwen3-coder:free | Qwen (Alibaba) | 1,048,576 | Large-repo coding |
| openai/gpt-oss-20b:free | OpenAI | 131,072 | Fast background edits |
| nvidia/nemotron-3-super-120b-a12b:free | NVIDIA | 1,000,000 | Long-context fallback |
| meta-llama/llama-3.3-70b-instruct:free | Meta | 131,072 | Reliable all-rounder |
For a full cross-tool ranking, see our hub post on the best free OpenRouter models for AI coding agents.
Limitations and When to Pay
Running Codex on free models is workable for practice and light tasks, but the constraints are real.
- Harness mismatch: Codex was engineered around OpenAI's reasoning models. Even GPT-OSS is a smaller, open-weight model, so complex autonomous tasks see more failed tool calls and retries than the paid gpt-5.x-codex models deliver.
- Responses API only: Since February 2026 Codex requires wire_api = "responses". Provider or model combinations that do not fully support the Responses shape can error or degrade.
- Rate limits: 20 requests per minute and 50 requests per day (before the $10 top-up) are easy to exhaust in one agentic session, which fires many tool calls per task.
- No SLA: Free OpenRouter requests are deprioritized at peak and can queue or return 429 errors.
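If you script your own calls against the free tier, the usual response to a 429 is to wait and retry with growing delays. The following is a generic sketch of exponential backoff with jitter, not something Codex or OpenRouter prescribes. The send callable is hypothetical (it stands for one request that returns an HTTP status code), and the 3-second base delay is an assumption derived from the 20-requests-per-minute ceiling.

```python
import random
import time
from typing import Callable


def call_with_backoff(
    send: Callable[[], int],
    max_attempts: int = 5,
    base_delay: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns a non-429 status or attempts run out.

    send is a hypothetical callable that performs one request and
    returns its HTTP status code. The last status seen is returned.
    """
    status = 429
    for attempt in range(max_attempts):
        status = send()
        if status != 429:
            return status
        if attempt < max_attempts - 1:
            # Exponential backoff with jitter: about 3s, 6s, 12s, ...
            # plus up to 1s of random noise to avoid synchronized retries.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
    return status


if __name__ == "__main__":
    # Simulated endpoint: rate-limited twice, then succeeds.
    responses = iter([429, 429, 200])
    recorded_delays = []
    result = call_with_backoff(lambda: next(responses), sleep=recorded_delays.append)
    print(result, len(recorded_delays))  # 200 2
```

Passing sleep as a parameter keeps the function testable without real waiting. Note that every retry still counts against the daily request allowance, so a low max_attempts is the safer default.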
The sensible pattern: prototype and run low-stakes edits on openai/gpt-oss-120b:free, and switch back to a paid gpt-5.x-codex model for production work that needs frontier reliability. If your workflow spans multiple agents, our comparison of OpenCode vs Claude Code and the OpenAI Codex CLI guide help you pick the right tool.
Related Guides
- Best Free OpenRouter Models for AI Coding Agents
- OpenAI Codex CLI Guide: App, IDE, MCP, and When to Use It
- Codex CLI MCP: When MCP Actually Improves Codex Workflows
- OpenCode vs Claude Code
- OpenRouter Free Models: Best Picks + Rotation Strategy
Frequently Asked Questions
Can OpenAI Codex use free models?
Yes, indirectly. Codex defaults to paid OpenAI models but supports any OpenAI-compatible provider through a [model_providers.<id>] block in ~/.codex/config.toml. Point that provider at OpenRouter with a base_url of https://openrouter.ai/api/v1 and set model to a free slug like openai/gpt-oss-120b:free to run Codex at zero model cost.
What is the best free model for OpenAI Codex?
The best free pick is openai/gpt-oss-120b:free, OpenAI's own open-weight model with a 131,072-token context window, because it aligns with Codex's tool-calling format. If you need to work across large repositories, qwen/qwen3-coder:free offers a 1,048,576-token context, though its tool formatting is a slightly weaker match for Codex.
Why does my Codex custom provider not work?
The two most common causes are placing the provider settings in a project-local .codex/config.toml, which Codex ignores, instead of the user-level ~/.codex/config.toml, and using an unsupported wire_api value. Since February 2026 the only accepted value is "responses", per the Codex configuration reference. The model slug must also be exact, including the provider prefix and :free suffix.
What are the OpenRouter free model rate limits?
OpenRouter free models are limited to 20 requests per minute and 50 requests per day if you have purchased less than $10 in credits, according to OpenRouter's rate-limit docs. A one-time $10 credit purchase raises the daily limit to 1,000 requests permanently, while the per-minute ceiling stays at 20.
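To see what those limits mean for an agent session, it helps to run the arithmetic. The sketch below uses the limits quoted in this answer. The function names are hypothetical, and the figure of 15 requests per task is an illustrative assumption, not a measured value for Codex.

```python
def daily_task_budget(requests_per_task: int, has_10_dollar_credit: bool = False) -> int:
    """How many whole agent tasks fit in one day's free-tier allowance."""
    daily_limit = 1_000 if has_10_dollar_credit else 50
    return daily_limit // requests_per_task


def min_seconds_between_requests(requests_per_minute: int = 20) -> float:
    """Average spacing needed to stay at or under the per-minute ceiling."""
    return 60 / requests_per_minute


if __name__ == "__main__":
    # Assumption: one agentic task fires about 15 requests.
    print(daily_task_budget(15))                             # 3
    print(daily_task_budget(15, has_10_dollar_credit=True))  # 66
    print(min_seconds_between_requests())                    # 3.0
```

Under that assumption, the base tier covers only about three tasks a day, which is why the $10 credit purchase matters more than the per-minute ceiling for most sessions.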
Do free models work as well as gpt-5-codex in Codex?
No. Codex's harness was tuned around OpenAI's paid reasoning models, so free open-weight models produce more failed tool calls, extra retries, and weaker results on complex multi-file tasks. Free models are good for learning and light edits, but production coding still favors the paid gpt-5.x-codex family.





