Remote OpenClaw Blog

Best Free Models for OpenAI Codex — Plug OpenRouter Into config.toml

7 min read · 20 October 2018

The best free model for OpenAI Codex is openai/gpt-oss-120b:free on OpenRouter, because it is OpenAI's own open-weight model and follows the tool-calling conventions Codex's harness expects. The honest framing matters here: the Codex CLI defaults to OpenAI's paid models (the gpt-5.x-codex family), which are not free. But Codex supports custom, OpenAI-compatible providers through its config.toml, so you can point it at OpenRouter's free model tier and run it at zero model cost.

As of July 2026, OpenRouter lists 23 free models (IDs ending in :free), limited to 20 requests per minute and 50 requests per day, rising to 1,000 requests per day after a one-time $10 credit purchase. This guide ranks the free models that pair best with Codex and gives you the exact config block to wire one up.

The Honest Situation: Codex and Free Models

The OpenAI Codex CLI is built around OpenAI's paid reasoning models and ships pointed at the gpt-5.x-codex family, which bill per token or come with a paid ChatGPT plan. There is no free model baked into Codex itself. If you want zero model cost, you have to bring your own provider.

Codex makes that possible through a [model_providers.<id>] block in its configuration. According to the Codex configuration reference, you can define any OpenAI-compatible endpoint with a base_url, an env_key for the API key, and a wire_api protocol. OpenRouter is OpenAI-compatible, so it slots straight in. OpenRouter documents the exact setup in its Codex CLI integration guide.

Two constraints shape which free models are worth using. First, since February 2026 the only accepted wire_api value is "responses" — Codex removed the older chat protocol. Second, Codex's agent harness (its apply_patch tool and reasoning loop) was tuned around OpenAI models, so free models that share OpenAI's tool conventions behave more reliably than those that do not. That is why the ranking below leads with OpenAI's own open-weight release rather than a raw benchmark winner.

How to Configure Codex With OpenRouter Free Models

You configure a free model by adding an OpenRouter provider to your user-level Codex config and selecting a :free model slug. The provider and model settings only take effect in ~/.codex/config.toml — Codex ignores them in a project-local config and prints a startup warning.

# ~/.codex/config.toml
model = "openai/gpt-oss-120b:free"
model_provider = "openrouter"

[model_providers.openrouter]
name = "OpenRouter"
base_url = "https://openrouter.ai/api/v1"
env_key = "OPENROUTER_API_KEY"
wire_api = "responses"

# In your shell, not in config.toml: provide your key via the env var named in env_key
export OPENROUTER_API_KEY="your-openrouter-key"

# Start Codex on the free model
codex

The model value has to be the exact OpenRouter slug, including the provider prefix and the :free suffix — bare model names are rejected. To switch free models, change the model line to any ID from the ranking below. For a deeper look at extending Codex with tools, see our Codex CLI MCP guide.

Best Free Models for OpenAI Codex, Ranked

These free OpenRouter models pair best with Codex's OpenAI-tuned harness, ranked for reliability inside the agent loop rather than raw benchmark scores. All IDs are verbatim from the live OpenRouter models list as of July 2026; the free roster rotates, so verify before committing.

1. GPT-OSS 120B — `openai/gpt-oss-120b:free`

Maker: OpenAI (open-weight, Apache-2.0). Context: 131,072 tokens. Because it comes from OpenAI, GPT-OSS 120B aligns most naturally with Codex's tool format and reasoning-effort settings, which means fewer malformed apply_patch calls than non-OpenAI models. It is the safest free default for Codex. Limitation: its 131K window is modest for very large repositories.

2. Qwen3-Coder — `qwen/qwen3-coder:free`

Maker: Qwen (Alibaba). Context: 1,048,576 tokens. Qwen3-Coder is the strongest free coding model on OpenRouter and the pick when your task needs to hold a whole repository in context. It generates high-quality diffs, though its tool-call formatting matches Codex's harness less closely than GPT-OSS's does, so expect occasional retries. Use it for large-context coding work.

3. GPT-OSS 20B — `openai/gpt-oss-20b:free`

Maker: OpenAI (open-weight). Context: 131,072 tokens. The lighter sibling of GPT-OSS 120B, it keeps the same OpenAI tool conventions while responding faster, which suits quick edits and iterative loops. Limitation: it is noticeably weaker on multi-step reasoning, so it is a background model rather than a primary one.

4. Nemotron 3 Super 120B — `nvidia/nemotron-3-super-120b-a12b:free`

Maker: NVIDIA. Context: 1,000,000 tokens. A million-token MoE with about 12B active parameters, Nemotron 3 Super is a strong long-context fallback when Qwen3-Coder is rate-limited. Limitation: as a generalist it trails coding specialists on tight code generation and can be verbose in tool responses.

5. Llama 3.3 70B Instruct — `meta-llama/llama-3.3-70b-instruct:free`

Maker: Meta. Context: 131,072 tokens. The most broadly supported free open model, Llama 3.3 70B behaves predictably across tools and is a dependable choice for explanation and refactoring. Limitation: it is older than the top entries and its tool calling is less crisp than GPT-OSS inside Codex.

Comparison Table

Every model below is free on OpenRouter and shares the same tier limits: 20 requests per minute and 50 requests per day, rising to 1,000 per day after a one-time $10 credit purchase, per OpenRouter's rate-limit documentation.

Model ID (:free)	Maker	Context	Best For in Codex
openai/gpt-oss-120b:free	OpenAI	131,072	Primary — best tool alignment
qwen/qwen3-coder:free	Qwen (Alibaba)	1,048,576	Large-repo coding
openai/gpt-oss-20b:free	OpenAI	131,072	Fast background edits
nvidia/nemotron-3-super-120b-a12b:free	NVIDIA	1,000,000	Long-context fallback
meta-llama/llama-3.3-70b-instruct:free	Meta	131,072	Reliable all-rounder

For a full cross-tool ranking, see our hub post on the best free OpenRouter models for AI coding agents.

Limitations and When to Pay

Running Codex on free models is workable for practice and light tasks, but the constraints are real.

Harness mismatch: Codex was engineered around OpenAI's reasoning models. Even GPT-OSS is a smaller, open-weight model, so complex autonomous tasks see more failed tool calls and retries than they do on the paid gpt-5.x-codex models.
Responses API only: Since February 2026 Codex requires wire_api = "responses". Provider or model combinations that do not fully support the Responses shape can error or degrade.
Rate limits: 20 requests per minute and 50 requests per day (before the $10 top-up) are easy to exhaust in one agentic session, which fires many tool calls per task.
No SLA: Free OpenRouter requests are deprioritized at peak and can queue or return 429 errors.
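To see how quickly the rate limits bite, the arithmetic can be worked through in a few lines of Python. This is a rough sketch: the limits come from the figures quoted above, while the class name FreeTier and the figure of 15 requests per agentic task are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FreeTier:
    per_minute: int = 20
    per_day: int = 50  # 1,000 after the one-time $10 credit purchase

    def tasks_per_day(self, requests_per_task: int) -> int:
        """Whole tasks that fit in the daily request budget."""
        return self.per_day // requests_per_task

    def minutes_to_exhaust(self) -> float:
        """Minutes needed to spend the daily budget at the per-minute cap."""
        return self.per_day / self.per_minute


if __name__ == "__main__":
    REQUESTS_PER_TASK = 15  # hypothetical; real agent sessions vary widely

    base = FreeTier()
    topped_up = FreeTier(per_day=1000)

    print(base.tasks_per_day(REQUESTS_PER_TASK))       # 3
    print(topped_up.tasks_per_day(REQUESTS_PER_TASK))  # 66
    print(base.minutes_to_exhaust())                   # 2.5
    print(topped_up.minutes_to_exhaust())              # 50.0
```

Under these assumptions the base tier covers about three tasks a day and can be spent in two and a half minutes at full speed, which is why the $10 top-up changes the picture so much.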

The sensible pattern: prototype and run low-stakes edits on openai/gpt-oss-120b:free, and switch back to a paid gpt-5.x-codex model for production work that needs frontier reliability. If your workflow spans multiple agents, our comparison of OpenCode vs Claude Code and the OpenAI Codex CLI guide help you pick the right tool.

Frequently Asked Questions

Can OpenAI Codex use free models?

Yes, indirectly. Codex defaults to paid OpenAI models but supports any OpenAI-compatible provider through a [model_providers.<id>] block in ~/.codex/config.toml. Point that provider at OpenRouter with a base_url of https://openrouter.ai/api/v1 and set model to a free slug like openai/gpt-oss-120b:free to run Codex at zero model cost.

What is the best free model for OpenAI Codex?

The best free pick is openai/gpt-oss-120b:free, OpenAI's own open-weight model with a 131,072-token context window, because it aligns with Codex's tool-calling format. If you need to work across large repositories, qwen/qwen3-coder:free offers a 1,048,576-token context, though its tool formatting is a slightly weaker match for Codex.

Why does my Codex custom provider not work?

The two most common causes are placing the config in a project-local .codex/config.toml, which Codex ignores, instead of the user-level ~/.codex/config.toml, and using an unsupported wire_api value. Since February 2026 the only accepted value is "responses", per the Codex configuration reference. The model slug must also be exact, including the provider prefix and :free suffix.

What are the OpenRouter free model rate limits?

OpenRouter free models are limited to 20 requests per minute and 50 requests per day if you have purchased less than $10 in credits, according to OpenRouter's rate-limit docs. A one-time $10 credit purchase raises the daily limit to 1,000 requests permanently, while the per-minute ceiling stays at 20.

Do free models work as well as gpt-5-codex in Codex?

No. Codex's harness was tuned around OpenAI's paid reasoning models, so free open-weight models produce more failed tool calls, extra retries, and weaker results on complex multi-file tasks. Free models are good for learning and light edits, but production coding still favors the paid gpt-5.x-codex family.
