Human-in-the-Loop
Design confirmation flows, approval gates, and escalation patterns.
What You’ll Learn
AI agents that act without oversight are dangerous. AI agents that ask permission at every step are annoying. The best human-in-the-loop designs sit in the middle — involving humans at the right moments with the right information.
By the end, you’ll understand:
- The spectrum of autonomy and where each mode fits
- Confirmation flows and approval gates
- Escalation patterns for when the AI gets stuck
- The trust gradient: starting restrictive, opening up over time
The Problem
Consider two extremes:
Fully Manual:
AI: "May I read auth.ts?"
User: "Yes"
AI: "May I edit line 42?"
User: "Yes"
→ Correct but exhausting

Fully Autonomous:
AI: *reads 50 files, edits 12, runs deploy, pushes to prod*
User: "Wait, I didn't mean--"
→ Fast but terrifying
The real answer is in between. But where? It depends on the action, the stakes, and the trust level.
How It Works
The Autonomy Spectrum
┌──────────────────────────────────────────────────────┐
│ Manual Guided Semi-Auto Fully Auto │
│ │ │ │ │ │
│ Every Plan first, Auto for Everything│
│ action confirm reads/search, runs free │
│ prompted execution prompt writes │
│ │
│ claude claude claude claude │
│ --plan (default) (default) --danger.. │
│ │
│ ◄── More oversight Less oversight ──► │
└──────────────────────────────────────────────────────┘
The principle: reversibility determines autonomy. If you can undo it with git checkout, auto-allow it. If it changes production state, a human must approve.
| Action Type | Reversible? | Approach |
|---|---|---|
| Read files, search codebase | Yes | Auto-allow |
| Edit local files | Yes (git) | Prompt or allow |
| Run tests | Yes | Allow with allowlist |
| Run arbitrary commands | Maybe | Prompt user |
| Git push, deploy | No | Always prompt |
| Database mutations | No | Multi-step confirmation |
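The reversibility rule in the table above can be sketched as a simple policy lookup. This is an illustrative sketch, not Claude Code's actual implementation; the action names and the default-to-restrictive fallback are assumptions for the example.

```python
from enum import Enum

class Approval(Enum):
    AUTO_ALLOW = "auto-allow"          # run without asking
    PROMPT = "prompt"                  # ask the user before running
    MULTI_STEP = "multi-step confirm"  # require repeated, explicit confirmation

# Hypothetical mapping from action type to approval level,
# following the reversibility rule from the table above.
POLICY = {
    "read_file":    Approval.AUTO_ALLOW,   # reversible: no state change
    "search_code":  Approval.AUTO_ALLOW,
    "edit_file":    Approval.PROMPT,       # reversible via git, but worth a look
    "run_tests":    Approval.AUTO_ALLOW,   # allow via allowlist
    "run_command":  Approval.PROMPT,       # maybe reversible: ask
    "git_push":     Approval.PROMPT,       # irreversible: always prompt
    "db_mutation":  Approval.MULTI_STEP,   # irreversible: confirm twice
}

def approval_for(action: str) -> Approval:
    """Default to the most restrictive level for unknown actions."""
    return POLICY.get(action, Approval.MULTI_STEP)
```

The key design choice is the fallback: anything the policy does not recognize gets the strictest treatment, so new action types fail safe.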
Approval Gates
An approval gate is a strategic checkpoint where the AI pauses and presents its plan before continuing. Unlike per-tool permission prompts, these are designed workflow pauses.
┌──────────────────────────────────────────────────────┐
│ Phase 1: Research (autonomous) │
│ ├── Read files, search codebase, analyze deps │
│ ▼ │
│ ╔════════════════════════════════════════════════╗ │
│ ║ GATE: Present findings + proposed plan ║ │
│ ║ User reviews and approves/modifies ║ │
│ ╚════════════════════════════════════════════════╝ │
│ ▼ │
│ Phase 2: Implementation (semi-autonomous) │
│ ├── Edit files, run tests │
│ ▼ │
│ ╔════════════════════════════════════════════════╗ │
│ ║ GATE: Show results before deployment ║ │
│ ║ "All tests pass. Ready to push?" ║ │
│ ╚════════════════════════════════════════════════╝ │
│ ▼ │
│ Phase 3: Deploy (human-controlled) │
│ └── Push only with explicit approval │
└──────────────────────────────────────────────────────┘
Encode approval gates in your CLAUDE.md:
```markdown
## Workflow Rules
- Always present a plan before changing more than 3 files
- Never push to remote without explicit user confirmation
- After tests, report results and wait before proceeding
```
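The gate pattern itself is easy to express in code. Below is a minimal sketch of a phased workflow with explicit approval gates; the function names and phase stubs are illustrative, not a Claude Code API.

```python
def gate(message: str, ask=input) -> bool:
    """Pause the workflow and wait for explicit human approval."""
    answer = ask(f"GATE: {message} Proceed? [y/N] ")
    return answer.strip().lower() in ("y", "yes", "proceed")

def deploy_workflow(ask=input) -> str:
    findings = research()                       # Phase 1: autonomous
    if not gate(f"Plan: {findings}.", ask):
        return "stopped at plan review"
    results = implement()                       # Phase 2: semi-autonomous
    if not gate(f"Tests: {results}.", ask):
        return "stopped before deploy"
    return push()                               # Phase 3: human-approved only

# Stubs standing in for the real phases.
def research():  return "3 files to change"
def implement(): return "47/47 passing"
def push():      return "deployed"
```

Note that the default answer is "no": an empty or ambiguous response stops the workflow, which keeps the irreversible phase behind a deliberate human action.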
Escalation Patterns
When the AI encounters ambiguity or failure, it should escalate rather than guess. The decision depends on confidence and consequence:
| | Low Consequence | High Consequence |
|---|---|---|
| **High Confidence** | Self-resolve silently | Inform and continue |
| **Low Confidence** | Ask for clarification | Stop and report |
Level 1: Self-resolve → AI fixes the error on its own
Level 2: Inform + continue → AI notes the issue, keeps going
Level 3: Ask clarification → AI asks a specific question, waits
Level 4: Stop and report → AI cannot proceed safely, presents options
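The matrix maps directly onto a lookup over (confidence, consequence). A sketch with a hypothetical confidence threshold — the 0.8 cutoff is an assumption for illustration, not a prescribed value:

```python
def escalation_level(confidence: float, consequence: str) -> str:
    """Pick an escalation level from confidence (0.0-1.0) and consequence.

    The 0.8 threshold is illustrative; tune it for your workflow.
    """
    high_confidence = confidence >= 0.8
    high_consequence = consequence == "high"
    if high_confidence and not high_consequence:
        return "self-resolve"         # Level 1: fix it silently
    if high_confidence and high_consequence:
        return "inform and continue"  # Level 2: note the issue, keep going
    if not high_confidence and not high_consequence:
        return "ask clarification"    # Level 3: specific question, then wait
    return "stop and report"          # Level 4: present options, do nothing
```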
The Trust Gradient
Trust builds over time. Claude Code supports this through permission allowlists in .claude/settings.json:
```json
{
  "permissions": {
    "allow": [
      "Bash(npm run test)",
      "Bash(npm run lint)",
      "Bash(git status)",
      "Bash(git diff)"
    ]
  }
}
```
Each allowed pattern is a trust decision. Over weeks of use, your allowlist grows as you gain confidence in the AI’s behavior. Session 1 prompts for everything. Session 50 auto-allows most operations.
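Conceptually, each allowlist entry is a pattern matched against the tool call. The following is a rough illustration of how such matching could work using shell-style wildcards — it is not Claude Code's actual matcher, and the wildcard entry is a hypothetical example:

```python
import fnmatch

def is_allowed(tool_call: str, allowlist: list[str]) -> bool:
    """Return True if the tool call matches any allowlist pattern.

    Patterns may use shell-style wildcards, e.g. "Bash(git diff*)".
    """
    return any(fnmatch.fnmatch(tool_call, pattern) for pattern in allowlist)

allow = [
    "Bash(npm run test)",
    "Bash(npm run lint)",
    "Bash(git status)",
    "Bash(git diff*)",   # hypothetical wildcard entry
]

is_allowed("Bash(git status)", allow)       # matches: runs without a prompt
is_allowed("Bash(git push origin)", allow)  # no match: falls back to a prompt
```

Anything that fails to match falls through to the default behavior — prompting the user — so the allowlist only ever loosens restrictions, never tightens them.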
Key Insight
The best human-in-the-loop design makes human involvement feel natural, not annoying. The goal is not to minimize human interaction — it is to maximize the value of each interaction.
Bad HITL asks: “May I read this file?” (obviously yes, always)
Good HITL asks: “I found 3 approaches. Which do you prefer?” (genuine decision point)
The framework: auto-allow no-risk actions, gate at strategic decision points, escalate when uncertain, and never ask questions with obvious answers. This mirrors how you manage a junior developer — you review their plan and check their PR, not watch them type.
Hands-On Example
A deployment workflow with approval gates:
```markdown
## Deploy Workflow (in CLAUDE.md)
When I ask you to deploy:
1. RESEARCH (autonomous): check git status, run tests, check deps
2. GATE 1: show test results, git status, warnings. Wait for "proceed"
3. BUILD (autonomous): run production build, verify output
4. GATE 2: show build size, warnings. Ask "Ready to push to main?"
5. DEPLOY: only with explicit "yes" -- run git push origin main
```
The interaction:
```text
User: "Deploy the latest changes"
AI: [runs tests, checks status -- autonomous]
AI: "Pre-deploy: 47/47 tests pass, 3 files changed, no vulns. Build?"
User: "Yes"
AI: [builds -- autonomous]
AI: "Build: 245KB bundle, no warnings. Push to main?"
User: "Push it"
AI: [pushes, reports status]
```
Two human touchpoints. Each meaningful. Irreversible actions gated. Reversible actions automatic.
What Changed
| No HITL Design | With HITL Design |
|---|---|
| AI asks about everything or nothing | Strategic checkpoints at decision points |
| Users feel anxious or annoyed | Users feel informed and in control |
| Trust is binary (on/off) | Trust builds gradually via allowlists |
| Failures are silent or catastrophic | Failures escalate at the right level |
Next Session
Session 23 dives into Custom Agents & Skills — how to build your own Markdown-based extensions that encode workflow knowledge into reusable, shareable components.