UltraThink Is Deprecated (And What Replaced It)

The keyword era ends when “thinking” becomes a normal runtime setting. Here’s the practical model: defaults, overrides, and when to dial it up or down.

Tags: extended-thinking, performance, coding-agents

ultrathink used to be a cheat code: sprinkle a keyword into the prompt, get a bigger reasoning budget.

In newer Claude Code bundles, that approach is going away. “Thinking” is treated like a normal runtime setting instead of a magic word.

This post is about the practical replacement: what knobs exist, what they do, and how to use them without turning your CLI into sludge.

What You’ll Learn


Why the Keyword Era Ends

Magic keywords are a UI hack. They work until:

A serious agent needs explicit settings:


The Replacement Model

In practice, “extended thinking” is just two decisions made by the runtime:

  1. Should thinking be enabled for this request?
  2. If yes, what’s the maximum budget?

Those decisions are typically driven by:

The important part is not the exact defaults. It’s that the decision is explicit and repeatable.
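
To make the two decisions concrete, here is a minimal shell sketch. It is not how Claude Code resolves them internally. The function name `resolve_thinking`, the precedence between the inputs, and the 32000 figure in the usage note are all assumptions made for illustration.

```shell
# Hypothetical sketch: resolve "is thinking on?" and "what is the budget?"
# from explicit inputs. Names and precedence are assumptions, not the real runtime.
# Usage: resolve_thinking <always_enabled: true|false> <default_budget> [cap]
resolve_thinking() {
  local always_enabled="$1" default_budget="$2" cap="${3-}"
  local budget=0

  # Decision 1: thinking is only considered when the settings default is on.
  if [ "$always_enabled" = "true" ]; then
    budget="$default_budget"
  fi

  # Decision 2: an explicit cap, when present, bounds the budget.
  # A cap of 0 therefore disables thinking entirely.
  if [ -n "$cap" ] && [ "$cap" -lt "$budget" ]; then
    budget="$cap"
  fi

  if [ "$budget" -gt 0 ]; then
    echo "enabled=true budget=$budget"
  else
    echo "enabled=false budget=0"
  fi
}
```

Calling `resolve_thinking true 32000 20000` prints `enabled=true budget=20000`. The same inputs always produce the same answer, which is the property a keyword in the prompt cannot give you.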


How to Control It (Practical)

There are two controls you want.

1. Cap thinking budget

Recent bundles expose an environment-based cap. Conceptually:

# Example: cap thinking budget for one run
MAX_THINKING_TOKENS=20000 claude

If you set it to 0, you effectively disable extended thinking for that run.
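
If you set the cap by hand a lot, a small wrapper can catch typos before they reach the CLI. The helper below is a hypothetical convenience, not part of Claude Code. It assumes only that the runtime reads `MAX_THINKING_TOKENS` from the environment, as described above.

```shell
# Hypothetical helper: run a command with a validated thinking cap.
# Usage: with_thinking_cap <tokens> <command> [args...]
with_thinking_cap() {
  local cap="${1-}"

  # Reject anything that is not a non-negative integer (empty, "abc", "-5").
  case "$cap" in
    ''|*[!0-9]*)
      echo "with_thinking_cap: cap must be a non-negative integer" >&2
      return 2
      ;;
  esac
  shift

  # Require a command to run after the cap.
  if [ "$#" -lt 1 ]; then
    echo "with_thinking_cap: missing command" >&2
    return 2
  fi

  # The variable is set only for this one command, not for the whole shell.
  env MAX_THINKING_TOKENS="$cap" "$@"
}
```

For example, `with_thinking_cap 20000 claude` runs one capped session, and `with_thinking_cap 0 claude` runs one with extended thinking effectively off.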

2. Turn “always thinking” on/off

There’s also typically a settings-level default (conceptually):

{
  "alwaysThinkingEnabled": false
}

Names and exact locations can change. The key idea is that this should live in settings, not in your prompt text.


When to Dial It Up (And When Not To)

Extra thinking is worth it when the cost of a mistake is high:

It’s usually wasted on:

The trade-offs are predictable:

So the best default is: use thinking, but keep it capped.
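
One way to get "capped by default" is a fallback in your shell profile. This is a sketch only. The function name is hypothetical, and the 20000 fallback simply reuses the example value from earlier in this post rather than any recommended number.

```shell
# Hypothetical profile snippet: keep a caller-provided cap, otherwise
# fall back to a default. An explicit 0 is preserved, so "off" stays off.
# Usage: apply_default_thinking_cap [fallback_tokens]
apply_default_thinking_cap() {
  # ":=" assigns only when the variable is unset or empty.
  : "${MAX_THINKING_TOKENS:=${1:-20000}}"
  export MAX_THINKING_TOKENS
}
```

Call `apply_default_thinking_cap` once in your profile. A per-run override such as `MAX_THINKING_TOKENS=0 claude` still wins, because the fallback never overwrites a value that is already set.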

