Advanced Session 16 Session Storage JSONL

Session Storage

Learn about JSONL transcripts with UUID parent-child chains.

March 20, 2026 · 15 min read

What You’ll Learn

Every conversation you have with Claude Code is recorded. Not as a vague memory, but as a precise, line-by-line transcript in JSONL format. This is how sessions persist across restarts, how you can resume work days later, and how the system maintains continuity.

By the end, you’ll understand:

How Claude Code persists conversation state to disk
The JSONL (JSON Lines) format and why it was chosen
Session IDs and the UUID parent-child chain
What data is stored in each transcript entry
How session resumption works with claude --resume
How old sessions are cleaned up over time

The Problem

When you close your terminal mid-task, everything in memory is gone. The context window, the plan, the files you were working on — all lost.

Without persistence, every session starts from zero:

Monday:  "Refactor the auth module" → 45 min of work → close terminal
Tuesday: "Refactor the auth module" → starts over from scratch

This is unacceptable for serious development work. You need the AI to pick up where you left off, understanding what was done, what failed, and what remains.

The solution: write every message exchange to disk as it happens, in a format that can be replayed.

How It Works

The JSONL Format

Claude Code stores transcripts as JSONL — JSON Lines — files. Each line is a self-contained JSON object representing one event in the conversation:

{"type":"message","role":"user","content":"Fix the auth bug","timestamp":"2026-03-20T10:00:01Z","sessionId":"a1b2c3d4-..."}
{"type":"message","role":"assistant","content":[{"type":"text","text":"Let me look at the auth module..."},{"type":"tool_use","id":"tu_1","name":"Read","input":{"file_path":"src/auth.ts"}}],"timestamp":"2026-03-20T10:00:03Z","sessionId":"a1b2c3d4-..."}
{"type":"tool_result","tool_use_id":"tu_1","content":"export function validateToken(token: string) { ... }","timestamp":"2026-03-20T10:00:03Z","sessionId":"a1b2c3d4-..."}

Why JSONL instead of a single JSON array?

Format	Append Cost	Parse Cost	Corruption Resistance
JSON array `[...]`	Rewrite entire file	Parse entire file	One bad byte corrupts all
JSONL (one per line)	Append one line	Parse line by line	Bad line skippable

JSONL is append-only: each new message is written to the end of the file, and lines already on disk are never rewritten. An interrupted write can therefore damage only the line being written, not the history before it.
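The append and skip-on-error behavior from the table can be sketched in a few lines of Python. This is a minimal illustration, not Claude Code's actual implementation; the helper names `append_entry` and `read_entries` are hypothetical.

```python
import json


def append_entry(path, entry):
    """Append one event as a single JSON line (hypothetical helper)."""
    # json.dumps escapes newlines inside strings, so each entry stays on one line.
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")


def read_entries(path):
    """Yield every parseable entry, skipping blank or corrupt lines."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                yield json.loads(line)
            except json.JSONDecodeError:
                # For example, a line cut short by a crash mid-write.
                continue
```

Because the reader handles one line at a time, a damaged line costs only that one entry; with a single JSON array, the same damage would make the whole file unparseable.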

Session IDs: UUID Identification

Every session gets a UUID (Universally Unique Identifier) when it starts:

Session: a1b2c3d4-e5f6-7890-abcd-ef1234567890

This UUID appears in every line of the transcript, linking all events to their session. It also appears in the filename:

~/.claude/projects/<project-hash>/
  sessions/
    a1b2c3d4-e5f6-7890-abcd-ef1234567890.jsonl
    b2c3d4e5-f6a7-8901-bcde-f12345678901.jsonl
    c3d4e5f6-a7b8-9012-cdef-123456789012.jsonl
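As a rough sketch of how a session ID maps to a transcript file under the layout shown above: the function name `new_transcript_path` is hypothetical, and the use of random (version 4) UUIDs is an assumption, since the text only says that a UUID is assigned.

```python
import uuid
from pathlib import Path


def new_transcript_path(project_dir):
    """Return (session_id, path) for a fresh session under project_dir."""
    # Assumption: a random version-4 UUID; the text does not specify the version.
    session_id = str(uuid.uuid4())
    return session_id, Path(project_dir) / "sessions" / f"{session_id}.jsonl"


def session_id_from_path(path):
    """Recover the session ID from a transcript filename."""
    return Path(path).stem
```

Keeping the ID in the filename means a session can be located without opening any transcript.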

Parent-Child Session Chains

Sessions are not isolated — they form chains. When you resume a session or spawn a subagent, a parent-child relationship is created:

┌──────────────────────────────────────────────┐
│                Session Chain                 │
│                                              │
│  ┌──────────────────────┐                    │
│  │ Session A (original) │                    │
│  │ ID: a1b2c3d4-...     │                    │
│  │ parent: null         │                    │
│  └─────────────┬────────┘                    │
│                │                             │
│        ┌───────┴────────┐                    │
│        │                │                    │
│        ▼                ▼                    │
│  ┌───────────┐   ┌────────────┐              │
│  │ Session B │   │ Session C  │              │
│  │ (resumed) │   │ (subagent) │              │
│  │ parent:   │   │ parent:    │              │
│  │ a1b2c3d4  │   │ a1b2c3d4   │              │
│  └───────────┘   └──────┬─────┘              │
│                         │                    │
│                         ▼                    │
│                   ┌───────────┐              │
│                   │ Session D │              │
│                   │ (sub-sub) │              │
│                   │ parent:   │              │
│                   │ c3d4e5f6  │              │
│                   └───────────┘              │
│                                              │
└──────────────────────────────────────────────┘

Each session's session_start entry records the ID of its parent session, so the sessions form a tree:

{
  "sessionId": "b2c3d4e5-f6a7-8901-bcde-f12345678901",
  "parentSessionId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "type": "session_start",
  "timestamp": "2026-03-20T14:30:00Z",
  "resumedFrom": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
}

This chain enables the system to:

Trace the full history of a task across multiple sittings
Understand which subagent sessions belong to which parent
Reconstruct the full context when resuming
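Walking this chain can be sketched as follows. The field names `sessionId` and `parentSessionId` come from the session_start example above; the helper names are hypothetical, and the single-letter IDs stand in for the full UUIDs in the diagram.

```python
def build_parent_map(session_starts):
    """Map each sessionId to its parentSessionId (None for a root session)."""
    return {e["sessionId"]: e.get("parentSessionId") for e in session_starts}


def ancestry(session_id, parent_map):
    """Return the chain from session_id up to its root session."""
    chain = []
    seen = set()
    current = session_id
    while current is not None and current not in seen:
        seen.add(current)  # guard against accidental cycles
        chain.append(current)
        current = parent_map.get(current)
    return chain


def children_of(session_id, parent_map):
    """Return the IDs of sessions whose parent is session_id."""
    return sorted(s for s, p in parent_map.items() if p == session_id)


# Sessions A to D from the diagram, with shortened IDs for readability.
starts = [
    {"type": "session_start", "sessionId": "A", "parentSessionId": None},
    {"type": "session_start", "sessionId": "B", "parentSessionId": "A"},
    {"type": "session_start", "sessionId": "C", "parentSessionId": "A"},
    {"type": "session_start", "sessionId": "D", "parentSessionId": "C"},
]
parent_map = build_parent_map(starts)
print(ancestry("D", parent_map))     # ['D', 'C', 'A']
print(children_of("A", parent_map))  # ['B', 'C']
```

Following parent links upward answers "where did this work come from?", while scanning for children answers "what did this session spawn?".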

What Gets Stored

Each line in the JSONL transcript captures a specific event type:

Event Type	Contains	Purpose
`session_start`	Session ID, parent ID, timestamp, project path	Marks beginning of a session
`message`	Role, content blocks, timestamp	User or assistant messages
`tool_use`	Tool name, input parameters, toolUseId	What the AI wants to do
`tool_result`	toolUseId, output content, duration	What actually happened
`compaction`	Summary text, tokens saved	Context was compacted
`session_end`	Duration, token count, cost	Session statistics

A typical transcript entry for a tool call looks like:

{
  "type": "tool_use",
  "sessionId": "a1b2c3d4-...",
  "toolName": "Bash",
  "toolInput": {
    "command": "npm test",
    "description": "Run test suite"
  },
  "toolUseId": "tu_abc123",
  "timestamp": "2026-03-20T10:05:22Z"
}

Followed by its result:

{
  "type": "tool_result",
  "sessionId": "a1b2c3d4-...",
  "toolUseId": "tu_abc123",
  "content": "Tests: 14 passed, 2 failed",
  "durationMs": 3400,
  "timestamp": "2026-03-20T10:05:25Z"
}
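A tool call and its result are tied together only by the shared ID, so matching them is a lookup on that field. The sketch below assumes the flat entry layout and field names shown in the two examples above (`toolUseId`, `toolName`, `durationMs`); the function name `pair_tool_calls` is hypothetical.

```python
def pair_tool_calls(entries):
    """Return (tool_use, tool_result) pairs matched on toolUseId.

    The result is None when no matching tool_result entry exists, for
    example when the session ended before the tool finished.
    """
    results = {
        e["toolUseId"]: e
        for e in entries
        if e.get("type") == "tool_result" and "toolUseId" in e
    }
    return [
        (e, results.get(e.get("toolUseId")))
        for e in entries
        if e.get("type") == "tool_use"
    ]


entries = [
    {"type": "tool_use", "toolName": "Bash", "toolUseId": "tu_abc123",
     "toolInput": {"command": "npm test"}},
    {"type": "tool_result", "toolUseId": "tu_abc123",
     "content": "Tests: 14 passed, 2 failed", "durationMs": 3400},
]
for call, result in pair_tool_calls(entries):
    print(f"{call['toolName']} ({result['durationMs']} ms): {result['content']}")
# Bash (3400 ms): Tests: 14 passed, 2 failed
```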

Session Resumption

When you run claude --resume, the system:

Lists recent sessions for the current project
Reads the JSONL transcript of the selected session
Reconstructs the message array from the transcript events
Applies compaction if the restored context is too large
Creates a new child session linked to the original via parentSessionId

claude --resume
       │
       ▼
┌───────────────┐
│ Find recent   │
│ sessions for  │
│ this project  │
└──────┬────────┘
       │
       ▼
┌───────────────┐
│ Parse JSONL   │
│ transcript    │
│ line by line  │
└──────┬────────┘
       │
       ▼
┌───────────────┐
│ Rebuild       │
│ message array │
│ (replay)      │
└──────┬────────┘
       │
       ▼
┌───────────────┐     ┌──────────────┐
│ Context too   │─Yes─│ Compact to   │
│ large?        │     │ fit window   │
└──────┬────────┘     └──────┬───────┘
       │ No                  │
       ▼                     ▼
┌──────────────────────────────┐
│ New session (child of        │
│ original), ready to continue │
└──────────────────────────────┘

The resumed session starts with the earlier conversation in its context, or with a compacted summary of it if the full transcript was too large. Either way, it knows which files were edited, what commands were run, and what the user’s goal was.
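The replay and re-linking steps can be sketched as below. This is an illustration under assumptions, not the actual resume logic: it uses the event types and session_start fields shown earlier in this section, the helper names are hypothetical, and the compaction step is left out.

```python
import uuid
from datetime import datetime, timezone

# Assumption: only these event types carry conversational content.
REPLAYED_TYPES = {"message", "tool_use", "tool_result"}


def rebuild_context(entries):
    """Keep conversational events in order, dropping bookkeeping entries."""
    return [e for e in entries if e.get("type") in REPLAYED_TYPES]


def start_child_session(parent_id):
    """Build a session_start entry linking a new session to its parent."""
    return {
        "sessionId": str(uuid.uuid4()),
        "parentSessionId": parent_id,
        "type": "session_start",
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "resumedFrom": parent_id,
    }
```

The original transcript is only read, never modified: the resumed work goes into a new file whose first entry points back at the parent.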

Transcript Cleanup

Sessions accumulate over time. Claude Code manages this through cleanup:

Recent sessions are kept intact for fast resumption
Older sessions may have their transcripts compacted or trimmed
Very old sessions are eventually removed from disk

Each project gets its own storage directory, named by a hash of the project’s absolute path, so sessions from different projects stay separate:

~/.claude/projects/
  abc123def/          ← hash of /Users/you/project-alpha
    sessions/
    settings.json
  fed321cba/          ← hash of /Users/you/project-beta
    sessions/
    settings.json
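The age-based part of this cleanup can be sketched as a scan over one project's sessions directory. The text does not state the real retention rules, so `max_age_days` is a hypothetical parameter and `find_stale_transcripts` a hypothetical helper; it only lists candidates and deletes nothing.

```python
import time
from pathlib import Path


def find_stale_transcripts(sessions_dir, max_age_days, now=None):
    """List .jsonl files last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    return sorted(
        p for p in Path(sessions_dir).glob("*.jsonl")
        if p.stat().st_mtime < cutoff
    )
```

Using the modification time rather than the session's start time means a long-running or recently resumed session is not treated as old.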

Key Insight

Session storage turns ephemeral conversations into durable work logs. The JSONL format is not just an implementation detail — it reflects a fundamental design philosophy: conversations are append-only event streams, not mutable documents.

This matters because:

You can resume any session that is still on disk: close your laptop, reopen it tomorrow, and continue where you left off
Subagent work is traceable: every subagent creates its own session linked to the parent, so you can audit what happened
The format is human-readable: you can open a JSONL file and read through the conversation manually
Crash recovery is built in: since each line is independently valid, a crash mid-write loses at most the last, partially written line

The parent-child chain is particularly powerful. It means the system can reconstruct not just a single conversation, but an entire tree of work — the main session, the subagents it spawned, the subagents those subagents spawned — all linked by UUIDs.

Hands-On Example

Inspecting a Session Transcript

You can explore your own session transcripts directly:

# Find your project's session directory
ls ~/.claude/projects/

# List recent sessions (sorted by modification time)
ls -lt ~/.claude/projects/<project-hash>/sessions/ | head -10

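# Read the first few lines of a transcript
head -5 ~/.claude/projects/<project-hash>/sessions/<session-id>.jsonl

# The remaining commands use relative paths, so change into the project directory first
cd ~/.claude/projects/<project-hash>

# Pretty-print a single line to see its structure
head -1 sessions/<session-id>.jsonl | python3 -m json.tool

# Count the lines that record a tool call
# (exact string match, so it assumes compact JSON with no space after the colon)
grep -c '"type":"tool_use"' sessions/<session-id>.jsonl

# Find all file edits in a session
# (--json-lines, Python 3.8+, lets json.tool read one JSON object per line)
grep '"toolName":"Edit"' sessions/<session-id>.jsonl | python3 -m json.tool --json-lines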

Building a Session Summary Script

Here is a simple script to summarize what happened in a session:

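import json
import sys

def summarize_session(filepath):
    """Print the time span, message count, and tool usage of one transcript."""
    tools_used = {}
    messages = 0
    start_time = None
    end_time = None

    with open(filepath, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip a corrupt or partially written line
            if not isinstance(entry, dict):
                continue  # every transcript entry should be a JSON object

            # Track the first and last timestamps seen in the file
            timestamp = entry.get("timestamp")
            if timestamp:
                if start_time is None:
                    start_time = timestamp
                end_time = timestamp

            if entry.get("type") == "message":
                messages += 1
            elif entry.get("type") == "tool_use":
                tool = entry.get("toolName", "unknown")
                tools_used[tool] = tools_used.get(tool, 0) + 1

    print(f"Session: {filepath}")
    print(f"Duration: {start_time} to {end_time}")
    print(f"Messages: {messages}")
    print("Tool calls:")
    # Most frequently used tools first
    for tool, count in sorted(tools_used.items(), key=lambda x: -x[1]):
        print(f"  {tool}: {count}")

if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit(f"usage: {sys.argv[0]} <transcript.jsonl>")
    summarize_session(sys.argv[1])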

This gives you output like:

Session: sessions/a1b2c3d4-....jsonl
Duration: 2026-03-20T10:00:01Z to 2026-03-20T10:45:33Z
Messages: 34
Tool calls:
  Read: 12
  Edit: 8
  Bash: 6
  Grep: 4
  Glob: 3

What Changed

Without Session Storage	With JSONL Transcripts
Every session starts from zero	Resume any previous session
No record of past work	Full audit trail of every action
Subagent work is invisible	Parent-child chains trace all work
Crash loses everything	Append-only format survives crashes
Context is memory-only	Context is reconstructed from disk

Next Session

Now that you understand how sessions are stored and linked, Session 17 tackles the most impactful file in any Claude Code project: CLAUDE.md Design — how to engineer your project instructions for maximum effectiveness, and the 3-tier hierarchy that controls the AI’s behavior.