Module 3: Real Architecture (4 / 6)
Session 16 · Advanced

Session Storage

Learn about JSONL transcripts with UUID parent-child chains.

March 20, 2026 15 min read

What You’ll Learn

Every conversation you have with Claude Code is recorded. Not as a vague memory, but as a precise, line-by-line transcript in JSONL format. This is how sessions persist across restarts, how you can resume work days later, and how the system maintains continuity.

By the end, you’ll understand:

  • How Claude Code persists conversation state to disk
  • The JSONL (JSON Lines) format and why it was chosen
  • Session IDs and the UUID parent-child chain
  • What data is stored in each transcript entry
  • How session resumption works with claude --resume
  • How old sessions are cleaned up over time

The Problem

When you close your terminal mid-task, everything in memory is gone. The context window, the plan, the files you were working on — all lost.

Without persistence, every session starts from zero:

Monday:  "Refactor the auth module" → 45 min of work → close terminal
Tuesday: "Refactor the auth module" → starts over from scratch

This is unacceptable for serious development work. You need the AI to pick up where you left off, understanding what was done, what failed, and what remains.

The solution: write every message exchange to disk as it happens, in a format that can be replayed.

How It Works

The JSONL Format

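Claude Code stores transcripts as JSONL (JSON Lines) files. Each line is a self-contained JSON object representing one event in the conversation. The entries below are simplified for illustration, and exact field names may differ between versions: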

{"type":"message","role":"user","content":"Fix the auth bug","timestamp":"2026-03-20T10:00:01Z","sessionId":"a1b2c3d4-..."}
{"type":"message","role":"assistant","content":[{"type":"text","text":"Let me look at the auth module..."},{"type":"tool_use","id":"tu_1","name":"Read","input":{"file_path":"src/auth.ts"}}],"timestamp":"2026-03-20T10:00:03Z","sessionId":"a1b2c3d4-..."}
{"type":"tool_result","toolUseId":"tu_1","content":"export function validateToken(token: string) { ... }","timestamp":"2026-03-20T10:00:03Z","sessionId":"a1b2c3d4-..."}

Why JSONL instead of a single JSON array?

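Format               | Append Cost         | Parse Cost         | Corruption Resistance
---------------------|---------------------|--------------------|--------------------------
JSON array [...]     | Rewrite entire file | Parse entire file  | One bad byte corrupts all
JSONL (one per line) | Append one line     | Parse line by line | Bad line skippable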

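JSONL is append-only: each new message is written to the end of the file, and lines already on disk are never rewritten. An interrupted write can therefore damage only the final line, not the history before it. With a single writer per session file, no locking is needed either.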
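The append-and-skip behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not Claude Code's actual implementation: the function names `append_event` and `read_events` are hypothetical, and the sketch assumes a single writer per file.

```python
import json
import os
from pathlib import Path
from typing import Any, Iterator


def append_event(path: Path, event: dict[str, Any]) -> None:
    """Append one event as a single JSON line; existing lines are never rewritten."""
    # json.dumps escapes newlines inside strings, so one event is always one line.
    line = json.dumps(event, ensure_ascii=False)
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the line to disk before returning


def read_events(path: Path) -> Iterator[dict[str, Any]]:
    """Yield every parseable event, skipping blank or truncated lines."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                yield json.loads(line)
            except json.JSONDecodeError:
                # e.g. a partial final line left behind by a crash mid-write
                continue
```

Because the reader treats each line independently, a damaged line costs one event rather than the whole transcript, which is the property the table above attributes to JSONL.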

Session IDs: UUID Identification

Every session gets a UUID (Universally Unique Identifier) when it starts:

Session: a1b2c3d4-e5f6-7890-abcd-ef1234567890

This UUID appears in every line of the transcript, linking all events to their session. It also appears in the filename:

~/.claude/projects/<project-hash>/
  sessions/
    a1b2c3d4-e5f6-7890-abcd-ef1234567890.jsonl
    b2c3d4e5-f6a7-8901-bcde-f12345678901.jsonl
    c3d4e5f6-a7b8-9012-cdef-123456789012.jsonl
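As a rough sketch of how an ID and its filename relate, the snippet below generates a session ID and derives the transcript path from it. It assumes random (version 4) UUIDs and the `sessions/` layout shown above; the helper names `new_session_file` and `session_id_from_path` are hypothetical.

```python
import uuid
from pathlib import Path


def new_session_file(project_dir: Path) -> tuple[str, Path]:
    """Return a fresh session ID and the transcript path that would hold it."""
    session_id = str(uuid.uuid4())  # assumption: random version-4 UUIDs
    return session_id, project_dir / "sessions" / f"{session_id}.jsonl"


def session_id_from_path(path: Path) -> str:
    """Recover the session ID from a transcript filename, checking it is a UUID."""
    return str(uuid.UUID(path.stem))  # raises ValueError if the stem is not a UUID
```

Keeping the ID in the filename means a session can be located without opening any transcript.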

Parent-Child Session Chains

Sessions are not isolated — they form chains. When you resume a session or spawn a subagent, a parent-child relationship is created:

┌───────────────────────────────────┐
│  Session Chain                    │
│                                   │
│  ┌──────────────────────┐         │
│  │ Session A (original) │         │
│  │ ID: a1b2c3d4-...     │         │
│  │ parent: null         │         │
│  └──────────┬───────────┘         │
│             │                     │
│        ┌────┴──────────┐          │
│        │               │          │
│        ▼               ▼          │
│  ┌───────────┐   ┌───────────┐    │
│  │ Session B │   │ Session C │    │
│  │ (resumed) │   │ (subagent)│    │
│  │ parent:   │   │ parent:   │    │
│  │ a1b2c3d4  │   │ a1b2c3d4  │    │
│  └───────────┘   └─────┬─────┘    │
│                        │          │
│                        ▼          │
│                  ┌───────────┐    │
│                  │ Session D │    │
│                  │ (sub-sub) │    │
│                  │ parent:   │    │
│                  │ c3d4e5f6  │    │
│                  └───────────┘    │
│                                   │
└───────────────────────────────────┘

Each session’s session_start entry records its parent’s session ID, and together these links form a tree:

{
  "sessionId": "b2c3d4e5-f6a7-8901-bcde-f12345678901",
  "parentSessionId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "type": "session_start",
  "timestamp": "2026-03-20T14:30:00Z",
  "resumedFrom": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
}

This chain enables the system to:

  • Trace the full history of a task across multiple sittings
  • Understand which subagent sessions belong to which parent
  • Reconstruct the full context when resuming
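To make the tree concrete, here is a small sketch that walks the chain in both directions. It assumes the `sessionId` and `parentSessionId` fields from the session_start example above; the function names are hypothetical.

```python
from typing import Optional


def build_parent_index(session_starts: list[dict]) -> dict[str, Optional[str]]:
    """Map each session ID to its parent session ID (None for a root session)."""
    return {e["sessionId"]: e.get("parentSessionId") for e in session_starts}


def lineage(session_id: str, parents: dict[str, Optional[str]]) -> list[str]:
    """Walk from a session up to its root and return the IDs oldest-first."""
    chain: list[str] = []
    seen: set[str] = set()
    current: Optional[str] = session_id
    while current is not None and current not in seen:
        seen.add(current)  # guard against a malformed, cyclic chain
        chain.append(current)
        current = parents.get(current)
    return list(reversed(chain))


def children_of(session_id: str, parents: dict[str, Optional[str]]) -> list[str]:
    """Return the IDs of sessions whose parent is session_id, sorted for stability."""
    return sorted(sid for sid, parent in parents.items() if parent == session_id)
```

For the diagram above, `lineage` on Session D would return A, C, D, and `children_of` on Session A would return B and C.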

What Gets Stored

Each line in the JSONL transcript captures a specific event type:

Event Type    | Contains                                      | Purpose
--------------|-----------------------------------------------|-----------------------------
session_start | Session ID, parent ID, timestamp, project path | Marks beginning of a session
message       | Role, content blocks, timestamp               | User or assistant messages
tool_use      | Tool name, input parameters, toolUseId        | What the AI wants to do
tool_result   | toolUseId, output content, duration           | What actually happened
compaction    | Summary text, tokens saved                    | Context was compacted
session_end   | Duration, token count, cost                   | Session statistics

A typical transcript entry for a tool call looks like this (pretty-printed here for readability; in the file, each entry occupies a single line):

{
  "type": "tool_use",
  "sessionId": "a1b2c3d4-...",
  "toolName": "Bash",
  "toolInput": {
    "command": "npm test",
    "description": "Run test suite"
  },
  "toolUseId": "tu_abc123",
  "timestamp": "2026-03-20T10:05:22Z"
}

Followed by its result:

{
  "type": "tool_result",
  "sessionId": "a1b2c3d4-...",
  "toolUseId": "tu_abc123",
  "content": "Tests: 14 passed, 2 failed",
  "durationMs": 3400,
  "timestamp": "2026-03-20T10:05:25Z"
}
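The shared `toolUseId` is what ties a call to its outcome. A minimal sketch of that join, assuming the field names used in the two entries above (the function name `pair_tool_calls` is hypothetical):

```python
from typing import Optional


def pair_tool_calls(events: list[dict]) -> list[tuple[dict, Optional[dict]]]:
    """Match each tool_use event with its tool_result (None if no result was logged)."""
    # Index results by ID first so each call is matched in constant time.
    results = {
        e["toolUseId"]: e
        for e in events
        if e.get("type") == "tool_result" and "toolUseId" in e
    }
    return [
        (e, results.get(e.get("toolUseId")))
        for e in events
        if e.get("type") == "tool_use"
    ]
```

A call paired with `None` is one whose result never reached the transcript, for example because the session ended while the tool was still running.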

Session Resumption

When you run claude --resume, the system:

  1. Lists recent sessions for the current project
  2. Reads the JSONL transcript of the selected session
  3. Reconstructs the message array from the transcript events
  4. Applies compaction if the restored context is too large
  5. Creates a new child session linked to the original via parentSessionId

claude --resume
        │
        ▼
┌───────────────┐
│ Find recent   │
│ sessions for  │
│ this project  │
└───────┬───────┘
        │
        ▼
┌───────────────┐
│ Parse JSONL   │
│ transcript    │
│ line by line  │
└───────┬───────┘
        │
        ▼
┌───────────────┐
│ Rebuild       │
│ message array │
│ (replay)      │
└───────┬───────┘
        │
        ▼
┌───────────────┐        ┌──────────────┐
│ Context too   │──Yes──▶│ Compact to   │
│ large?        │        │ fit window   │
└───────┬───────┘        └──────┬───────┘
        │ No                    │
        ▼                       ▼
┌───────────────────────────────────────┐
│ New session (child of original),      │
│ ready to continue                     │
└───────────────────────────────────────┘

The resumed session starts with the earlier conversation in its context, in summarized form if compaction was applied. It knows which files were edited, what commands were run, and what the user’s goal was.
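Steps 3 and 5 of the resumption flow can be sketched as follows. This is an illustration under assumptions, not the actual implementation: it replays only `message` events, omits compaction (step 4) because its algorithm is not described here, and uses the hypothetical names `replay` and `child_session_start`.

```python
import uuid
from datetime import datetime, timezone


def replay(events: list[dict]) -> list[dict]:
    """Rebuild the role/content message list from transcript events, in file order."""
    return [
        {"role": e["role"], "content": e["content"]}
        for e in events
        if e.get("type") == "message"
    ]


def child_session_start(parent_id: str) -> dict:
    """Build the session_start entry for a new session resumed from parent_id."""
    return {
        "type": "session_start",
        "sessionId": str(uuid.uuid4()),  # the resumed session gets its own ID
        "parentSessionId": parent_id,
        "resumedFrom": parent_id,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```

Writing the child entry to a new file, rather than appending to the parent's transcript, leaves the original session untouched and keeps the chain explicit.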

Transcript Cleanup

Sessions accumulate over time. Claude Code manages this through cleanup:

  • Recent sessions are kept intact for fast resumption
  • Older sessions may have their transcripts compacted or trimmed
  • Very old sessions are eventually removed from disk
  • The project directory is scoped by a hash of the project path, keeping sessions from different projects separate

The storage location uses a hash of the project’s absolute path:

~/.claude/projects/
  abc123def/          ← hash of /Users/you/project-alpha
    sessions/
    settings.json
  fed321cba/          ← hash of /Users/you/project-beta
    sessions/
    settings.json
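The sketch below illustrates path-scoped storage and age-based cleanup. Both parts rest on assumptions: the hash function (truncated SHA-256) and the age-by-modification-time rule are stand-ins, since neither the hashing scheme nor the retention policy is specified here. The names `project_dir_name` and `stale_transcripts` are hypothetical.

```python
import hashlib
import time
from pathlib import Path
from typing import Optional


def project_dir_name(project_path: str, length: int = 9) -> str:
    """Derive a stable directory name from a project's absolute path."""
    # Assumption: truncated SHA-256; the real hashing scheme may differ.
    digest = hashlib.sha256(project_path.encode("utf-8")).hexdigest()
    return digest[:length]


def stale_transcripts(
    sessions_dir: Path, max_age_days: float, now: Optional[float] = None
) -> list[Path]:
    """List transcripts last modified more than max_age_days ago (deletes nothing)."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    return sorted(
        p for p in sessions_dir.glob("*.jsonl") if p.stat().st_mtime < cutoff
    )
```

Returning candidates instead of deleting them keeps the sketch safe to run against a real directory; a caller would decide what to do with the list.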

Key Insight

Session storage turns ephemeral conversations into durable work logs. The JSONL format is not just an implementation detail — it reflects a fundamental design philosophy: conversations are append-only event streams, not mutable documents.

This matters because:

  • You can resume any session still on disk: close your laptop, reopen it tomorrow, and continue where you left off
  • Subagent work is traceable — every subagent creates its own session linked to the parent, so you can audit what happened
  • The format is human-readable — you can open a JSONL file and read through the conversation manually
  • Crash recovery is built in — since each line is independently valid, a crash mid-write only loses the last partial line

The parent-child chain is particularly powerful. It means the system can reconstruct not just a single conversation, but an entire tree of work — the main session, the subagents it spawned, the subagents those subagents spawned — all linked by UUIDs.

Hands-On Example

Inspecting a Session Transcript

You can explore your own session transcripts directly:

# Find your project's session directory
ls ~/.claude/projects/

# Move into that directory so the relative paths below resolve
cd ~/.claude/projects/<project-hash>/

# List recent sessions (sorted by modification time)
ls -lt sessions/ | head -10

# Read the first few lines of a transcript
head -5 sessions/<session-id>.jsonl

# Pretty-print a single line to see its structure
head -1 sessions/<session-id>.jsonl | python3 -m json.tool

# Count the lines that record a tool call
# (assumes compact JSON with no space after the colon)
grep -c '"type":"tool_use"' sessions/<session-id>.jsonl

# Find all file edits in a session
# (--json-lines, Python 3.8+, lets json.tool accept one object per line)
grep '"toolName":"Edit"' sessions/<session-id>.jsonl | python3 -m json.tool --json-lines

Building a Session Summary Script

Here is a simple script to summarize what happened in a session:

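import json
import sys


def summarize_session(filepath):
    """Print message and tool-call counts for one JSONL transcript."""
    tools_used = {}
    messages = 0
    start_time = None
    end_time = None

    with open(filepath, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip a corrupt or partially written line

            # Track the first and last timestamps seen
            timestamp = entry.get("timestamp")
            if timestamp:
                if start_time is None:
                    start_time = timestamp
                end_time = timestamp

            if entry.get("type") == "message":
                messages += 1
            elif entry.get("type") == "tool_use":
                tool = entry.get("toolName", "unknown")
                tools_used[tool] = tools_used.get(tool, 0) + 1

    print(f"Session: {filepath}")
    print(f"Duration: {start_time} to {end_time}")
    print(f"Messages: {messages}")
    print("Tool calls:")
    # Most frequently used tools first
    for tool, count in sorted(tools_used.items(), key=lambda x: -x[1]):
        print(f"  {tool}: {count}")


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit(f"usage: {sys.argv[0]} <transcript.jsonl>")
    summarize_session(sys.argv[1])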

This gives you output like:

Session: sessions/a1b2c3d4-....jsonl
Duration: 2026-03-20T10:00:01Z to 2026-03-20T10:45:33Z
Messages: 34
Tool calls:
  Read: 12
  Edit: 8
  Bash: 6
  Grep: 4
  Glob: 3

What Changed

Without Session Storage        | With JSONL Transcripts
-------------------------------|------------------------------------
Every session starts from zero | Resume any previous session
No record of past work         | Full audit trail of every action
Subagent work is invisible     | Parent-child chains trace all work
Crash loses everything         | Append-only format survives crashes
Context is memory-only         | Context is reconstructed from disk

Next Session

Now that you understand how sessions are stored and linked, Session 17 tackles the most impactful file in any Claude Code project: CLAUDE.md Design — how to engineer your project instructions for maximum effectiveness, and the 3-tier hierarchy that controls the AI’s behavior.