Context Compaction
Learn how Claude Code manages context window limits with a 3-layer compression strategy — keeping conversations productive even through long sessions.
What You’ll Learn
Context windows are finite. Even with 200K tokens, a long coding session with many file reads and tool calls will eventually hit the limit. When it does, Claude Code doesn’t just stop — it compacts.
By the end, you’ll understand:
- Why context limits matter in practice
- The 3-layer compression strategy
- What information survives compaction
- How to structure work for better context efficiency
The Problem
A typical coding session might look like:
System prompt: ~3,000 tokens
CLAUDE.md: ~1,000 tokens
Skill definitions: ~2,000 tokens
File reads (10 files): ~15,000 tokens
Tool outputs: ~5,000 tokens
Conversation: ~10,000 tokens
─────────────────────────────────────
Total: ~36,000 tokens
Now imagine a longer session with 50 file reads, multiple searches, and several rounds of edits. You can easily hit 100K+ tokens. At some point, the context window needs to be trimmed.
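The arithmetic above can be sketched directly. All token counts here are illustrative estimates from the table, not measured values, and the scale-up factors for a long session are assumptions:

```python
# Rough token budget for the session sketched above (estimates, not measurements).
BUDGET = {
    "system_prompt": 3_000,
    "claude_md": 1_000,
    "skills": 2_000,
    "file_reads": 15_000,   # ~10 files
    "tool_outputs": 5_000,
    "conversation": 10_000,
}

total = sum(BUDGET.values())
print(total)  # 36000 tokens for a modest session

# Scale the variable parts for a long session: ~50 file reads,
# several times the tool output and conversation.
long_session = (total
                + 4 * BUDGET["file_reads"]      # 40 more files
                + 5 * BUDGET["tool_outputs"]    # many more tool calls
                + 2 * BUDGET["conversation"])   # longer back-and-forth
print(long_session)  # 141000 — comfortably past 100K
```

Fixed overhead (system prompt, CLAUDE.md, skills) stays constant; it is the file reads and tool outputs that grow without bound, which is why Layer 1 targets them first.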
How It Works
The 3-Layer Strategy
Claude Code uses a progressive compression approach:
┌──────────────────────────────────────────────┐
│ Context Compaction │
│ │
│ Layer 1: Trim Tool Outputs │
│ ┌───────────────────────────────────────┐ │
│ │ Long tool results → truncated │ │
│ │ "File contents (5000 lines)..." │ │
│ │ → "File contents (first 200 lines)..." │ │
│ └───────────────────────────────────────┘ │
│ │
│ Layer 2: Summarize Old Messages │
│ ┌───────────────────────────────────────┐ │
│ │ Early conversation turns → │ │
│ │ compressed summary │ │
│ │ │ │
│ │ Turn 1-15: "User asked to fix auth │ │
│ │ bug. Found issue in middleware. │ │
│ │ Fixed JWT validation. Tests pass." │ │
│ └───────────────────────────────────────┘ │
│ │
│ Layer 3: Preserve Recent Context │
│ ┌───────────────────────────────────────┐ │
│ │ Last N turns kept in full │ │
│ │ System prompt always kept │ │
│ │ CLAUDE.md always kept │ │
│ │ Active task list always kept │ │
│ └───────────────────────────────────────┘ │
│ │
└──────────────────────────────────────────────┘
What Survives Compaction
| Always Preserved | Compressed | Removed |
|---|---|---|
| System prompt | Old conversation turns | Redundant tool outputs |
| CLAUDE.md | Early file reads | Superseded edits |
| Current task list | Previous search results | Intermediate exploration |
| Recent conversation | Old error messages | Resolved error details |
The Compaction Trigger
Compaction happens automatically when context usage approaches the limit:
Context usage: ████████████████░░░░ 80% ← Normal
Context usage: ██████████████████░░ 90% ← Approaching limit
Context usage: ████████████████████ 95% ← COMPACT NOW
After compaction:
Context usage: ████████████░░░░░░░░ 60% ← Room to continue
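The trigger logic implied by the bars above can be sketched as two small functions. The 95% trigger and 60% post-compaction target are taken from the diagram; treating them as fixed constants is an assumption, not Claude Code's actual implementation:

```python
CONTEXT_LIMIT = 200_000   # tokens
COMPACT_AT = 0.95         # trigger threshold (from the diagram)
TARGET = 0.60             # post-compaction target (from the diagram)

def should_compact(used_tokens: int) -> bool:
    """Fire compaction when usage reaches the threshold."""
    return used_tokens / CONTEXT_LIMIT >= COMPACT_AT

def tokens_to_free(used_tokens: int) -> int:
    """How many tokens compaction should reclaim to land at the target."""
    return max(0, used_tokens - int(CONTEXT_LIMIT * TARGET))

print(should_compact(160_000))   # 80% -> False
print(should_compact(190_000))   # 95% -> True
print(tokens_to_free(190_000))   # 70000 tokens to reclaim
```

Compacting down to 60% rather than just under the limit is the important design choice: it buys enough headroom that compaction runs occasionally, not on every turn.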
Preserving Critical Information
The system is smart about what to preserve:
- System instructions — Always kept (defines behavior)
- User’s original request — Always kept (the goal)
- Current state — What files have been modified, what’s pending
- Recent turns — The last several interactions in full
- Summary of earlier work — Compressed but not lost
Key Insight
Compaction is not deletion — it’s summarization. The AI doesn’t lose awareness of what happened earlier; it loses the verbatim details but keeps the essence.
This has practical implications for how you work:
- Long sessions are fine — compaction keeps them productive
- Critical context should be in CLAUDE.md — it’s never compacted
- Break large tasks into phases — each phase can start with a fresh context
- Use subagents for research — their context is separate and discarded after
The most common mistake is micromanaging context limits yourself. In practice, compaction handles them automatically. The system degrades gracefully: you'll notice slightly less detailed recall of earlier conversation, but the work continues.
Hands-On Example
Optimizing for Context Efficiency
Structure your work to minimize context waste:
# Inefficient: loading everything then working
"Read all files in src/, then fix the bug in auth.ts"
→ Loads 50 files, only 1 is relevant
# Efficient: targeted exploration
"The bug is in authentication. Check src/auth/ first."
→ Loads 3-5 files, context stays lean
# Even better: use subagents for exploration
"Use an Explore agent to find where authentication
is handled, then fix the bug."
→ Exploration context is isolated in subagent
Monitoring Context Usage
Claude Code shows context usage in the UI. Watch for the indicator:
─────────────────────────────────────
Context: ████████████████░░░░ 80%
─────────────────────────────────────
When you see high context usage:
- Consider starting a new session for unrelated tasks
- Use /compact to manually trigger compaction
- Delegate research to subagents
What Makes a Good Summary
When compaction summarizes your earlier work, it captures:
Summary of turns 1-25:
- User asked to implement user authentication for Express app
- Explored codebase: Express 4.x, PostgreSQL, no existing auth
- Decided on JWT + bcrypt approach
- Created: src/middleware/auth.ts, src/routes/auth.ts
- Modified: src/app.ts (added auth middleware)
- Added: user table migration
- Tests written and passing (4 tests)
- Current task: Adding password reset flow
This summary preserves the key decisions and state without the 25 turns of back-and-forth.
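One way to picture why this works is to treat the summary as a structured record rather than prose: goal, findings, decisions, file changes, and current task each get a field. The field names and `render` helper below are hypothetical, invented for this sketch:

```python
# Illustrative structured form of the summary above (field names are assumptions).
summary = {
    "turns": "1-25",
    "goal": "Implement user authentication for Express app",
    "findings": ["Express 4.x", "PostgreSQL", "no existing auth"],
    "decisions": ["JWT + bcrypt"],
    "created": ["src/middleware/auth.ts", "src/routes/auth.ts"],
    "modified": ["src/app.ts (added auth middleware)"],
    "current_task": "Adding password reset flow",
}

def render(s: dict) -> str:
    """Collapse the record into the compact text form that replaces old turns."""
    return (f"Summary of turns {s['turns']}: goal={s['goal']}; "
            f"decisions={', '.join(s['decisions'])}; "
            f"created={', '.join(s['created'])}; "
            f"next={s['current_task']}")

print(render(summary))
```

A few hundred characters of this kind of record can stand in for tens of thousands of tokens of transcript, because decisions and state are what later turns actually need.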
What Changed
| No Compaction | With 3-Layer Compaction |
|---|---|
| Session ends when context fills | Sessions can continue indefinitely |
| All history at full detail | Progressive compression |
| Context crisis is abrupt | Gradual, graceful degradation |
| Must restart for long tasks | Automatic management |
| Critical info can be lost | System prompt + CLAUDE.md preserved |
Next Session
This completes Module 1: Core Agent! You now understand the fundamental building blocks: the agent loop, tools, planning, subagents, knowledge loading, and context management.
In Module 2, we scale up to multi-agent systems. Session 7 starts with Task Graph & Dependencies — how Claude Code coordinates multiple tasks with dependency edges, enabling complex workflows.