Inside Claude Code's Compaction System

How Claude Code manages long-running context — reverse engineered from a shipped bundle.

compactioncontext-managementcontext-window

TL;DR: Compaction is not “summarize and hope.” Claude Code treats it as an operational mechanism: it sheds bulky tool outputs early, and when the context is close to full it summarizes into a structured “working state,” then rehydrates the few things that make you productive (recent files, todos, and a continuation instruction).

This post is based on reverse engineering a Claude Code bundle and focusing on the parts you can actually reimplement.

What You’ll Learn


The Context Window

The context window is everything the model can see at once: system prompts, your messages, assistant messages, tool outputs, and any injected file contents.


What is Compaction?

Compaction is summarization plus context restoration. After summarizing, Claude Code re-reads your recent files, restores your task list, and tells Claude to pick up where it left off.

ApproachWhat it does
TruncationCut old messages. Simple but lossy.
SummarizationCondense conversation into summary. Preserves meaning but loses detail.
CompactionSummarize + restore recent files + preserve todos + inject continuation instructions.

Before and after compaction: context window visualization


How It Works

Claude Code manages context through three user-facing mechanisms:

  1. Microcompaction — offload bulky tool results early
  2. Auto-compaction — compact when the session is close to full
  3. Manual compaction — compact at a task boundary when you decide it’s time

1. Microcompaction

When tool outputs get large, Claude Code saves them to disk and keeps only a reference in the model context. Recent tool results stay “inline” so you can keep reasoning; older results become “stored on disk, retrievable by path.”

Applies to: Read, Bash, Grep, Glob, WebSearch, WebFetch, Edit, Write

The key idea is the split:

If you are implementing this yourself, treat microcompaction as a cache policy plus a persistence format:


2. Auto-Compaction

Auto-compaction is driven by headroom accounting. Claude Code deliberately reserves space for:

Once “free space” drops below the reserved headroom, compaction triggers.

/context command output showing context usage breakdown

Two operational details matter more than the exact numbers:


3. Manual Compaction (The “Task Boundary” Move)

Claude Code exposes a manual compaction trigger so you can compact at a task boundary (after a feature lands, after a refactor, before you pivot to something else).

It also supports a “focus hint” so the summary doesn’t bury the lead. Conceptually:

/compact                                    # Use defaults
/compact Focus on the API changes           # Custom focus
/compact Preserve the database schema decisions

For persistent customization, Claude Code can read a project-level instruction section and append it to every compaction prompt. The portable pattern is:

## Compact Instructions
When summarizing, focus on TypeScript code changes and
remember the mistakes made and how they were fixed.

The Compaction Prompt

When compaction triggers, the model gets a structured summarization job, not an open-ended “summarize this.”

The contract (paraphrased) looks like “write a working state that allows continuation without re-asking questions,” including:

Two small design choices matter:

  1. It’s a checklist, so the model can’t “forget” an important bucket.
  2. It asks for structured sections, so the system can store and re-inject it reliably.

Post-Compaction Restoration

After summarizing, Claude Code rebuilds context with:

  1. Boundary marker — marks the compaction point
  2. Summary message — the compressed working state
  3. Recent files — re-read the few most recently accessed files
  4. Todo list — preserve task state
  5. Plan state — if you were in a “plan required” workflow
  6. Hook outputs — if startup hooks injected context

The key insight is the file rehydration: the system re-reads what you were just working on, so you don’t lose your place.


The Continuation Message

After compaction, the summary gets wrapped in this message:

This session is being continued from a previous conversation that ran out
of context. The summary below covers the earlier portion of the conversation.

[SUMMARY]

Please continue the conversation from where we left it off without asking
the user any further questions. Continue with the last task that you were
asked to work on.

Tuning Knobs (Portable)

If you’re implementing compaction in your own agent, you’ll want explicit configuration for:


Aside: Background Task Summarization

Everything above covers your main conversation. This section describes how background agents manage their own context separately.

When you spawn background or remote agents (via the Task tool), they run in isolated context. Rather than producing a full “working state” snapshot every time, Claude Code uses delta summarization to track their progress:

You are given a few messages from a conversation, as well as a summary
of the conversation so far. Your task is to summarize the new messages
based on the summary so far. Aim for 1-2 sentences at most, focusing on
the most important details.

This incremental approach tracks progress without storing full context — each update builds on the previous summary rather than reprocessing everything. Your main conversation remains unaffected.


Summary

Claude Code’s compaction system:

  1. Microcompaction — preserve a small hot tail, offload bulky outputs to disk
  2. Auto-compaction — trigger compaction when free space falls below reserved headroom
  3. Structured summarization — preserve intent, decisions, errors, and “what to do next”
  4. Rehydration — re-read recent files and restore todo/plan state so work continues cleanly
  5. File restoration — re-reads your 5 most recent files after summarizing
  6. Continuation message — tells Claude to resume without re-asking what you want

Background agents use separate delta summarization (1-2 sentence incremental updates).


Best Practices

Compact at task boundaries — Don’t wait for auto-compact. Run /compact when you finish a feature or fix a bug, while context is clean.

Clear between unrelated tasks/clear resets context entirely. Better than polluting context with unrelated work.

Use subagents for exploration — Heavy exploration happens in separate context, keeping your main conversation clean.

Monitor with /context — See what’s consuming space. Disable unused MCP servers.


← Back to all posts