Remote OpenClaw Blog

Ultrathink in Claude Code: What It Actually Does in 2026

8 min read · 20 October 2018

Ultrathink is a prompt keyword that, in 2025 versions of Claude Code, triggered the maximum extended thinking budget of 31,999 tokens. As of July 2026 the keyword no longer appears in the official Claude Code documentation or changelog: thinking is now adaptive on current Claude models, and reasoning depth is controlled with the /effort command (low through xhigh), the --thinking CLI flag, and the MAX_THINKING_TOKENS environment variable.

Key Takeaways

In April 2025, Anthropic documented an escalating set of thinking keywords: "think" < "think hard" < "think harder" < "ultrathink", each mapping to a larger thinking budget.
Simon Willison found the exact budgets in Claude Code's minified source: think = 4,000 tokens, megathink = 10,000, ultrathink = 31,999.
Fixed thinking budgets are deprecated at the API level: current Claude models use adaptive thinking, where the model decides how much to think per request.
The modern levers are /effort (Claude Code defaults Opus 4.8 to high; /effort xhigh for the hardest tasks), claude --thinking disabled, MAX_THINKING_TOKENS=0, and a per-model thinking toggle.
Thinking tokens are billed as output tokens, and you pay for the full thinking process even when the display is summarized or omitted.
Typing "ultrathink" today is harmless, but it is no longer a documented control; use effort settings for predictable behavior.

What Is Ultrathink?

Ultrathink is the strongest of the "magic words" that early Claude Code versions scanned for in your prompt to decide how much extended thinking budget to allocate before acting. Extended thinking is the model's internal reasoning phase: gray scratchpad text where Claude works through a problem before writing the visible answer or making tool calls.

The keywords became famous through Anthropic's April 2025 engineering post, Claude Code: Best practices for agentic coding, which recommended asking Claude to plan first and noted that "think" < "think hard" < "think harder" < "ultrathink" each mapped to increasing thinking budgets. The word stuck because it worked and because it was fun to type. What most people asking about ultrathink in 2026 actually want is the current, supported way to make Claude Code reason harder, which is what the rest of this guide covers.

The 2025 Thinking Keywords and Their Budgets

The original keyword-to-budget mapping was hardcoded, and the exact numbers were reverse-engineered from Claude Code's minified JavaScript by Simon Willison on April 19, 2025. His analysis found three tiers:

Keyword (2025)	Thinking budget	Also triggered by
think	4,000 tokens	-
megathink	10,000 tokens	"think hard", "think deeply", "think a lot", "think more"
ultrathink	31,999 tokens	"think harder", "think intensely", "think really hard", "think super hard"

Two details matter for interpreting this history. First, the mapping was a Claude Code implementation detail, not a model feature: the CLI translated your keyword into the API's budget_tokens parameter. Second, the budget was a ceiling, not a command; the model could stop thinking early. Both details explain why the mechanism aged out once the underlying API changed.
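
To make the historical mapping concrete, here is a minimal sketch of how a CLI could translate trigger phrases into a budget ceiling. It is not the Claude Code source: the function name, the matching strategy (whole-phrase match, strongest tier first), and the zero fallback are assumptions for illustration. Only the phrases and token counts come from the table above.

```python
import re

# Historical budgets from Simon Willison's April 2025 analysis (not current behavior).
BUDGETS = {"think": 4_000, "megathink": 10_000, "ultrathink": 31_999}

# Trigger phrases per tier, copied from the table above; strongest tier is checked first.
TRIGGERS = [
    ("ultrathink", ["ultrathink", "think harder", "think intensely",
                    "think really hard", "think super hard"]),
    ("megathink", ["megathink", "think hard", "think deeply",
                   "think a lot", "think more"]),
    ("think", ["think"]),
]


def thinking_budget(prompt: str) -> int:
    """Return the historical budget ceiling for a prompt, or 0 if no keyword matches.

    Hypothetical helper: the matching rules here are an assumption, not the
    logic Claude Code actually used.
    """
    text = prompt.lower()
    for tier, phrases in TRIGGERS:
        for phrase in phrases:
            # Whole-phrase match so "think hard" does not fire on "think harder".
            if re.search(r"\b" + re.escape(phrase) + r"\b", text):
                return BUDGETS[tier]
    return 0


print(thinking_budget("think hard about the edge cases"))  # 10000
print(thinking_budget("ultrathink before you refactor"))   # 31999
```

The returned number is a ceiling in the sense described above: it is the value a CLI would pass as budget_tokens, not the number of tokens the model would actually spend.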

What Ultrathink Does in Claude Code Today

As of July 2026, the string "ultrathink" does not appear in the official Claude Code documentation at code.claude.com or in the project's public changelog, and the fixed 31,999-token budget it once set is gone. The reason is an API-level shift: on current Claude models, manual thinking budgets (budget_tokens) are deprecated or rejected outright in favor of adaptive thinking, where the model itself decides when and how much to think, scaled by an effort setting.

In practice that means thinking is on by default in Claude Code sessions with supported models, without any keyword. Typing "ultrathink" in a prompt today is harmless, and because instruction-following models respond to emphasis, phrases like "think hard about the edge cases" still legitimately nudge the model to reason more. But there is no documented guarantee behind the keyword anymore. Some community writeups report that recent builds map "ultrathink" to a higher effort level for the next turn; we could not verify that in the official changelog, so treat it as unconfirmed and use the documented controls below when the behavior matters.

How to Control Thinking in 2026

Claude Code now exposes reasoning depth through effort levels and thinking toggles rather than prompt keywords. These are the documented controls as of July 2026:

Control	How to use it	What it does
`/effort`	Run `/effort` in a session and pick a level	Sets reasoning effort. Per the changelog, Opus 4.8 "defaults to high effort" and `/effort xhigh` is recommended "for your hardest tasks". Your choice can persist as the default for new sessions.
`--thinking` flag	`claude --thinking disabled`	Disables extended thinking for that session (added in v2.1.166 alongside the other toggles).
`MAX_THINKING_TOKENS`	`MAX_THINKING_TOKENS=0` in your environment or settings `env`	Setting it to 0 disables thinking persistently; historically it also set a manual budget on models that accepted one.
Per-model toggle	Model picker UI	Turns thinking on or off for a specific model selection.
ultracode keyword	Include "ultracode" in a prompt	A separate feature: triggers Claude Code's dynamic multi-agent workflow mode, pairing top-end effort with orchestration. Renamed from "workflow" in v2.1.154.
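
For scripted launches, the two "off" switches in the table can be sketched as a small helper that builds the command line and environment. The helper name and its parameters are hypothetical. The flag and variable are the ones listed above, so check them against your installed version before relying on this.

```python
import os


def build_launch(disable_thinking: bool, use_env: bool = False):
    """Return (argv, env) for starting Claude Code. Nothing is executed here.

    Hypothetical helper. It uses either the per-session flag or the
    environment variable from the table above, never both.
    """
    argv = ["claude"]
    env = dict(os.environ)  # copy so the caller's environment is untouched
    if disable_thinking:
        if use_env:
            env["MAX_THINKING_TOKENS"] = "0"
        else:
            argv += ["--thinking", "disabled"]
    return argv, env


argv, env = build_launch(disable_thinking=True)
print(argv)  # ['claude', '--thinking', 'disabled']
```

Passing the result to subprocess.run(argv, env=env) would start the session. Setting the variable this way affects only that child process, so a persistent default still belongs in your shell profile or settings env.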

At the API level, the same spectrum is exposed as output_config.effort with values low, medium, high, xhigh, and max, documented in the effort parameter reference. Effort controls more than thinking depth: lower levels also produce fewer, more consolidated tool calls and terser output. You can watch the current state live, since the Claude Code statusline JSON exposes both effort.level and thinking.enabled.
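
A custom statusline script could surface those two fields with a few lines of parsing. This is a sketch under assumptions: it reads the dotted names effort.level and thinking.enabled as nested JSON objects, and the sample payload is invented for illustration rather than copied from real Claude Code output.

```python
import json


def format_status(raw: str) -> str:
    """Summarize effort and thinking state from a statusline JSON string.

    Assumes effort.level and thinking.enabled are nested objects and
    falls back to "unknown" when a field is missing.
    """
    data = json.loads(raw)
    effort = data.get("effort", {}).get("level", "unknown")
    enabled = data.get("thinking", {}).get("enabled")
    if enabled is True:
        state = "on"
    elif enabled is False:
        state = "off"
    else:
        state = "unknown"
    return f"effort={effort} thinking={state}"


# Hypothetical payload containing only the two fields named above.
sample = '{"effort": {"level": "xhigh"}, "thinking": {"enabled": true}}'
print(format_status(sample))  # effort=xhigh thinking=on
```

Falling back to "unknown" rather than raising keeps the status line usable if a given build omits either field.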

Token and Cost Implications

Thinking tokens are billed as output tokens at the model's standard output rate, and per Anthropic's extended thinking documentation you are charged for the full thinking process even when the interface shows only a summary or nothing at all. Hiding thinking reduces latency and clutter, not cost.

The arithmetic is why "always ultrathink" was never a free lunch. At Claude Opus 4.8's output rate of $25 per million tokens (as of mid-2026), a prompt that actually consumed the old 31,999-token ultrathink ceiling could add roughly $0.80 in thinking alone, before a single line of visible output. Multiply that across a long agentic session and effort settings become a real budget lever. On subscription plans the cost shows up differently, as faster consumption of your 5-hour and weekly usage limits. The practical guidance: run at the default effort for routine work, reach for xhigh on genuinely hard problems, and drop to low or medium for mechanical edits. Our Claude model comparison covers which model tiers support which effort levels.
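
The arithmetic above is easy to reproduce. The sketch below uses the $25 per million output-token rate and the 31,999-token ceiling quoted in this section. The 20-turn session is a hypothetical worst case in which every turn spends the full ceiling, which real sessions would rarely do.

```python
def thinking_cost_usd(thinking_tokens: int, output_rate_per_mtok: float) -> float:
    """Cost of thinking tokens billed at the output rate (USD per million tokens)."""
    return thinking_tokens * output_rate_per_mtok / 1_000_000


# One prompt that consumes the full historical ultrathink ceiling.
per_prompt = thinking_cost_usd(31_999, 25.0)
print(f"${per_prompt:.2f}")  # $0.80

# Hypothetical worst case: 20 turns, each hitting the ceiling.
turns = 20
print(f"${turns * per_prompt:.2f}")  # $16.00
```

Because the budget was a ceiling rather than a guaranteed spend, these figures are upper bounds for the old mechanism, not typical costs.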

When Deeper Thinking Actually Helps

Extended thinking pays off on tasks with real search depth: architectural decisions, debugging with several plausible causes, multi-step migrations, and planning before large edits. Anthropic's own best-practices guidance has been consistent since 2025 on the pattern that works: ask Claude to research and plan first, with thinking emphasized, then execute the plan as a separate step. That workflow survives intact in 2026; only the trigger changed from a keyword to an effort setting.

Deeper thinking is wasted on lookups, renames, boilerplate, and anything where the first answer is almost always right. It can even hurt: more deliberation means more latency, and on trivial tasks high effort sometimes produces overthought, overbuilt solutions. If you want structured multi-step reasoning as an inspectable artifact rather than a hidden scratchpad, a tool like the Sequential Thinking MCP server is a complementary approach, and our Claude Code best practices guide covers when to plan versus just act.

Limitations

The honest caveat about ultrathink in 2026 is that keyword-triggered behavior is undocumented behavior. Claude Code has changed thinking mechanics several times (fixed budgets, then defaults, then adaptive thinking and effort), and anything you learned from a 2025 blog post, including the 4,000/10,000/31,999 numbers above, describes a version you are probably not running. Verify against the current docs before building habits or scripts around a keyword.

Also remember that thinking quality is bounded by context quality. No effort level rescues a prompt that is missing the constraint that matters, and a bloated context window degrades reasoning faster than a low effort setting does. Fix your CLAUDE.md and your prompt first; turn the effort dial second.

Frequently Asked Questions

Does ultrathink still work in Claude Code?

Not as a documented control. As of July 2026 the keyword does not appear in the official Claude Code docs or changelog, and the fixed 31,999-token budget it once triggered is gone. Thinking is adaptive on current models, and the supported ways to change reasoning depth are /effort, claude --thinking disabled, and MAX_THINKING_TOKENS.

How many tokens does ultrathink use?

Historically, ultrathink allocated a 31,999-token thinking budget, versus 10,000 for "think hard"/megathink and 4,000 for "think", per Simon Willison's April 2025 analysis of the Claude Code source. On current models there is no fixed number: adaptive thinking spends what the task needs, scaled by the effort level, and the budget was always a ceiling rather than a guaranteed spend.

What replaced ultrathink in Claude Code?

Effort settings. The /effort command sets reasoning effort (Opus 4.8 defaults to high, with xhigh for the hardest tasks), which maps to the API's output_config.effort parameter with levels from low to max. A separate keyword, "ultracode", triggers Claude Code's dynamic multi-agent workflow mode and is unrelated to the old thinking budgets.

How do I turn off extended thinking in Claude Code?

Three documented options as of v2.1.166: start a session with claude --thinking disabled, set the environment variable MAX_THINKING_TOKENS=0 for a persistent default, or use the per-model thinking toggle in the model picker. Disabling thinking speeds up simple tasks but typically hurts quality on complex, multi-step work.

Do I pay for Claude's thinking tokens?

Yes. Thinking tokens are billed as output tokens at the model's normal output rate, and Anthropic's docs are explicit that you are billed for the full thinking process even when the visible output is a summary or omitted entirely. On subscription plans, heavy thinking consumes your usage limits faster rather than showing up as a line item.
