Remote OpenClaw Blog
Best Free Models for ZeroClaw: Zero-Cost Agent Runtime
8 min read ·
The best free model for ZeroClaw is GPT-OSS 120B (openai/gpt-oss-120b:free) through OpenRouter's free tier, which delivers the strongest zero-cost reasoning available, has a 131K-token context window, and needs no credit card. Because ZeroClaw is a sub-5 MB Rust runtime that offloads inference to a provider, a free cloud model is the natural fit, and swapping to it is a one-line change in ZeroClaw's single TOML config.
ZeroClaw is a fast, minimal AI agent runtime shipped as a single Rust binary, configured entirely through ~/.zeroclaw/config.toml. Its provider catalog spans OpenAI, Anthropic, Google Gemini, xAI, Mistral, DeepSeek, OpenRouter with 100+ models, and any OpenAI-compatible endpoint including Ollama for local models, which is exactly why free models plug in so easily.
How ZeroClaw Selects Models
ZeroClaw stores all configuration in a single TOML file and picks a model through a [providers.models.default] block that names a provider, a model ID, and an API key. According to the ZeroClaw GitHub repository, a reused OpenAI-compatible adapter powers roughly 20 providers, so any free OpenRouter model or local Ollama model works through the same schema.
This matters for free models in two ways. First, the runtime itself uses under 5 MB of RAM and starts in milliseconds, so it runs on a cheap VPS or a Raspberry Pi while the actual inference happens on OpenRouter's free tier. Second, provider switching is a one-line config change, so you can move between a free cloud model and a local Ollama model without touching your agent logic. For the fuller runtime picture, see our OpenClaw vs ZeroClaw comparison and the ZeroClaw overview.
Best Free Models for ZeroClaw, Ranked
The ranking below favors models that are genuinely free, handle agent tool calling, and either route through OpenRouter's free tier or run locally via Ollama. All model IDs and context windows were verified against OpenRouter's free model collection as of July 2026.
1. GPT-OSS 120B (OpenAI)
GPT-OSS 120B is the best free model for ZeroClaw because the lightweight runtime is built to offload inference, and this model gives you the strongest free reasoning available. It is a mixture-of-experts model with a 131K-token context, available as openai/gpt-oss-120b:free.
- Maker: OpenAI · Context: 131K · License: Apache 2.0
- How to use with ZeroClaw: Set
provider = "openrouter"andmodel = "openai/gpt-oss-120b:free"in your default provider block. - Limitation: Bound by the free tier's 20 requests per minute, so heavy or bursty agents will hit the ceiling.
2. GPT-OSS 20B (OpenAI)
GPT-OSS 20B is the best free model for a local ZeroClaw setup, since it runs on roughly 16 GB of memory through Ollama while staying Apache 2.0 and strong at coding. It shares the 131K-token context and is available as openai/gpt-oss-20b:free on OpenRouter or gpt-oss:20b in Ollama.
- Maker: OpenAI · Context: 131K · License: Apache 2.0
- How to use with ZeroClaw: Point a provider block at a local Ollama endpoint, or use the OpenRouter free ID for a no-hardware option.
- Limitation: A local model needs a separate machine with real RAM, which offsets ZeroClaw's tiny footprint.
3. Qwen3 Coder (Alibaba Qwen)
Qwen3 Coder is the best free coding model for ZeroClaw, with a 1M-token context and state-of-the-art open code generation. It is available as qwen/qwen3-coder:free on OpenRouter and qwen3-coder in Ollama.
- Maker: Alibaba Qwen · Context: 1M · License: Apache 2.0
- How to use with ZeroClaw: Assign it to a coding agent block for large-repository refactors and long-context tasks.
- Limitation: Its free OpenRouter availability rotates and has dropped out for periods, so keep a fallback in the chain.
4. Llama 3.3 70B Instruct (Meta)
Llama 3.3 70B is a reliable free generalist for ZeroClaw with mature tool-calling behavior and wide support. It has a 131K-token context and is offered as meta-llama/llama-3.3-70b-instruct:free or llama3.3:70b in Ollama.
- Maker: Meta · Context: 131K · License: Llama 3.3 Community License
- How to use with ZeroClaw: Best via OpenRouter free unless you have roughly 40 GB of memory for the local build.
- Limitation: The 70B local build is heavy, and free routing is deprioritized during peak load.
5. Nemotron 3 Super 120B (NVIDIA)
Nemotron 3 Super 120B is a strong free choice for ZeroClaw agents that need a very large context, with a 1M-token window and NVIDIA's reasoning tuning. It is available as nvidia/nemotron-3-super-120b-a12b:free.
- Maker: NVIDIA · Context: 1M · License: NVIDIA Open Model License
- How to use with ZeroClaw: Route through OpenRouter free for long-context research and synthesis agents.
- Limitation: Large mixture-of-experts models can queue longer on the free tier during busy periods.
Free Model Comparison Table
The table below compares the top free ZeroClaw models on maker, context window, best use, and how each connects to the runtime.
| Model | Maker | Context | Best For | ZeroClaw Path |
|---|---|---|---|---|
| GPT-OSS 120B | OpenAI | 131K | Strongest free reasoning | OpenRouter free |
| GPT-OSS 20B | OpenAI | 131K | Local free default | Ollama or OpenRouter free |
| Qwen3 Coder | Alibaba Qwen | 1M | Coding, huge context | OpenRouter free or Ollama |
| Llama 3.3 70B | Meta | 131K | General-purpose, tool calling | OpenRouter free |
| Nemotron 3 Super 120B | NVIDIA | 1M | Long-context research | OpenRouter free |
For a full zero-cost model menu that applies across agent stacks, see our OpenRouter free models guide and the wider best free models for OpenClaw rundown.
How to Configure a Free Model in ZeroClaw
ZeroClaw selects a model through its TOML config, and switching to a free model is a one-line change. The example below sets GPT-OSS 120B on OpenRouter as the default and falls back to a local Ollama model when the free tier throttles.
# ~/.zeroclaw/config.toml
[providers.models.default]
provider = "openrouter"
model = "openai/gpt-oss-120b:free"
api_key = "${OPENROUTER_API_KEY}"
[providers.models.ollama-local]
provider = "ollama"
model = "gpt-oss:20b"
base_url = "http://localhost:11434"
[providers.fallback]
chain = ["default", "ollama-local"]
ZeroClaw also ships an interactive setup wizard, documented on the ZeroClaw install page, that walks through provider selection and API key entry, storing keys locally in the config file and sending them only to the AI provider. If you prefer that route, run the onboarding flow and pick OpenRouter, then paste your key.
# Provide your OpenRouter key to the environment referenced above
export OPENROUTER_API_KEY="your-openrouter-key"
# Optional local fallback: install Ollama and pull a free model
curl -fsSL https://ollama.com/install.sh | sh
ollama pull gpt-oss:20b
The local fallback uses Ollama, which serves any OpenAI-compatible model with no API cost and no rate limits once you have the hardware to run it.
Limitations and Tradeoffs
Free models keep ZeroClaw genuinely zero-cost, but there are tradeoffs. The OpenRouter free tier is capped at roughly 20 requests per minute and 200 requests per day per model, and free requests are deprioritized during peak traffic. That is comfortable for a single edge agent but too tight for high-throughput or multi-instance ZeroClaw fleets, which is one of ZeroClaw's core strengths.
Running a free model locally through Ollama removes the rate limit, but it reintroduces the RAM cost that ZeroClaw's tiny binary was designed to avoid. A local 20B model needs about 16 GB of memory on a separate host, so the lightweight runtime advantage only fully holds when you offload to a free cloud model. Free model availability also rotates: DeepSeek's free variants lost their free status on OpenRouter in mid-2026, so treat any free ID as subject to change and keep a fallback in the chain.
When not to use free models with ZeroClaw: high-volume edge fleets, unattended production automation, or latency-sensitive workloads where queueing on the free tier is unacceptable. For those, a cheap paid model removes the cap for a few dollars a month. See best cheap models for OpenClaw for the next step up.
Related Guides
- OpenClaw vs ZeroClaw: Full Comparison
- ZeroClaw Overview
- OpenRouter Free Models Guide
- Best Free Models for OpenClaw
Go deeper
The operator playbooks
Production-ready PDF guides for OpenClaw and Hermes Agent — $19.99 each.
Skills for this topic
Browse all skills →Frequently Asked Questions
What is the best free model for ZeroClaw?
The best free model for ZeroClaw is GPT-OSS 120B ( openai/gpt-oss-120b:free ) on OpenRouter's free tier. It gives the strongest free reasoning available, has a 131K-token context, and suits ZeroClaw's design of offloading inference from a sub-5 MB runtime. For a local option, GPT-OSS 20B via Ollama is the best free pick.
How do I set a free model in ZeroClaw's config?
Edit ~/.zeroclaw/config.toml and set a [providers.models.default] block with provider = "openrouter" , model = "openai/gpt-oss-120b:free" , and your key via api_key = "${OPENROUTER_API_KEY}" . Switching providers is a one-line change, and you can add a fallback chain to a local Ollama model for overflow.
Can ZeroClaw run free models locally?
Yes. ZeroClaw's provider catalog includes Ollama and any OpenAI-compatible endpoint, so you can pull a free local model such as gpt-oss:20b and point a provider block at http://localhost:11434 . Local inference has no rate limits, but it needs a machine with enough RAM, which offsets ZeroClaw's tiny footprint.
What are the free tier limits for ZeroClaw on OpenRouter?
OpenRouter's free tier allows roughly 20 requests per minute and about 200 requests per day per model, with no credit card required. Free requests are deprioritized during peak traffic. For a single ZeroClaw agent this is usually enough, but high-throughput edge fleets will need a paid model or a local fallback.
Which free model is best for coding agents on ZeroClaw?
Qwen3 Coder ( qwen/qwen3-coder:free ) is the best free coding model for ZeroClaw when its free slot is live, thanks to a 1M-token context and strong open code generation. GPT-OSS 120B is the more consistently available choice for general coding and agentic tasks.





