Module 4: Mastery · Session 22: Human-in-the-Loop Approval

Human-in-the-Loop

Design confirmation flows, approval gates, and escalation patterns.

March 20, 2026 · 17 min read

What You’ll Learn

AI agents that act without oversight are dangerous. AI agents that ask permission on every step are annoying. The best human-in-the-loop designs sit in the middle — involving humans at the right moments with the right information.

By the end, you’ll understand:

  • The spectrum of autonomy and where each mode fits
  • Confirmation flows and approval gates
  • Escalation patterns for when the AI gets stuck
  • The trust gradient: starting restrictive, opening up over time

The Problem

Consider two extremes:

Fully Manual:                          Fully Autonomous:
  AI: "May I read auth.ts?"              AI: *reads 50 files, edits 12,
  User: "Yes"                                 runs deploy, pushes to prod*
  AI: "May I edit line 42?"              User: "Wait, I didn't mean--"
  User: "Yes"
  → Correct but exhausting               → Fast but terrifying

The real answer is in between. But where? It depends on the action, the stakes, and the trust level.

How It Works

The Autonomy Spectrum

┌──────────────────────────────────────────────────────┐
│  Manual       Guided       Semi-Auto     Fully Auto  │
│    │            │              │              │       │
│  Every       Plan first,    Auto for       Everything│
│  action      confirm        reads/search,  runs free │
│  prompted    execution      prompt writes            │
│                                                       │
│  claude      claude         claude         claude     │
│  --plan      (default)      (default)      --danger.. │
│                                                       │
│  ◄── More oversight            Less oversight ──►    │
└──────────────────────────────────────────────────────┘

The principle: reversibility determines autonomy. If you can undo it with git checkout, auto-allow it. If it changes production state, a human must approve.

Action Type                 Reversible?   Approach
Read files, search codebase Yes           Auto-allow
Edit local files            Yes (git)     Prompt or allow
Run tests                   Yes           Allow with allowlist
Run arbitrary commands      Maybe         Prompt user
Git push, deploy            No            Always prompt
Database mutations          No            Multi-step confirmation
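
The table above can be sketched as a simple policy function. This is a minimal illustration; the action names and the `Approval` tiers are hypothetical, not a real Claude Code API:

```python
from enum import Enum

class Approval(Enum):
    AUTO_ALLOW = "auto-allow"
    PROMPT = "prompt"
    MULTI_STEP = "multi-step confirmation"

# Hypothetical mapping from action type to approval tier, following the
# reversibility rule: undoable actions run free, irreversible ones need a human.
POLICY = {
    "read_file":   Approval.AUTO_ALLOW,
    "search":      Approval.AUTO_ALLOW,
    "edit_file":   Approval.PROMPT,      # reversible via git, but prompt by default
    "run_tests":   Approval.AUTO_ALLOW,  # typically granted via an allowlist
    "run_command": Approval.PROMPT,      # maybe reversible -- ask
    "git_push":    Approval.PROMPT,      # irreversible -- always prompt
    "db_mutation": Approval.MULTI_STEP,  # irreversible -- extra confirmation
}

def approval_for(action: str) -> Approval:
    """Fail safe: default to prompting when an action type is unknown."""
    return POLICY.get(action, Approval.PROMPT)
```

Note the default: an unrecognized action falls back to a prompt, not to auto-allow, so new capabilities start on the cautious end of the spectrum.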

Approval Gates

An approval gate is a strategic checkpoint where the AI pauses and presents its plan before continuing. Unlike per-tool permission prompts, these are designed workflow pauses.

┌──────────────────────────────────────────────────────┐
│  Phase 1: Research (autonomous)                       │
│  ├── Read files, search codebase, analyze deps        │
│  ▼                                                    │
│  ╔════════════════════════════════════════════════╗   │
│  ║  GATE: Present findings + proposed plan        ║   │
│  ║  User reviews and approves/modifies            ║   │
│  ╚════════════════════════════════════════════════╝   │
│  ▼                                                    │
│  Phase 2: Implementation (semi-autonomous)            │
│  ├── Edit files, run tests                            │
│  ▼                                                    │
│  ╔════════════════════════════════════════════════╗   │
│  ║  GATE: Show results before deployment          ║   │
│  ║  "All tests pass. Ready to push?"              ║   │
│  ╚════════════════════════════════════════════════╝   │
│  ▼                                                    │
│  Phase 3: Deploy (human-controlled)                   │
│  └── Push only with explicit approval                 │
└──────────────────────────────────────────────────────┘

Encode approval gates in your CLAUDE.md:

## Workflow Rules
- Always present a plan before changing more than 3 files
- Never push to remote without explicit user confirmation
- After tests, report results and wait before proceeding
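
Outside of CLAUDE.md, the gate pattern itself is just a pause-and-confirm loop between phases. A minimal sketch, where the phase names and the `confirm` helper are illustrative rather than part of any real tool:

```python
def confirm(prompt: str, reply: str) -> bool:
    """Stand-in for a real user prompt; here the reply is passed in for testing."""
    return reply.strip().lower() in {"yes", "y", "proceed"}

def run_with_gates(phases, replies):
    """Run (name, work_fn) phases, pausing at a gate before each phase
    after the first. Returns the names of the phases that actually ran."""
    completed = []
    for i, (name, work) in enumerate(phases):
        if i > 0 and not confirm(f"Proceed to {name}?", replies[i - 1]):
            break  # user declined -- stop before this phase runs
        work()
        completed.append(name)
    return completed

# Example: research runs autonomously, the user approves implementation
# but declines deployment at the second gate.
ran = run_with_gates(
    [("research", lambda: None),
     ("implement", lambda: None),
     ("deploy", lambda: None)],
    replies=["yes", "no"],
)
# ran == ["research", "implement"]
```

The first phase needs no gate because research is reversible; every later phase runs only with a fresh approval.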

Escalation Patterns

When the AI encounters ambiguity or failure, it should escalate rather than guess. The decision depends on confidence and consequence:

                   Low Consequence          High Consequence
High Confidence    Self-resolve silently    Inform and continue
Low Confidence     Ask for clarification    Stop and report
Level 1: Self-resolve     → AI fixes the error on its own
Level 2: Inform + continue → AI notes the issue, keeps going
Level 3: Ask clarification → AI asks a specific question, waits
Level 4: Stop and report   → AI cannot proceed safely, presents options
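
The confidence/consequence matrix maps directly to code. A hypothetical sketch, with an arbitrary 0.8 confidence threshold (judging "high" vs. "low" consequence is itself a design decision):

```python
def escalation_level(confidence: float, consequence: str) -> int:
    """Map (confidence, consequence) to the four escalation levels.
    The 0.8 threshold is illustrative, not a standard value."""
    confident = confidence >= 0.8
    risky = consequence == "high"
    if confident and not risky:
        return 1  # self-resolve silently
    if confident and risky:
        return 2  # inform and continue
    if not confident and not risky:
        return 3  # ask for clarification
    return 4      # stop and report

# A flaky-but-fixable lint error: high confidence, low consequence -> level 1.
level = escalation_level(0.9, "low")
```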

The Trust Gradient

Trust builds over time. Claude Code supports this through permission allowlists in .claude/settings.json:

{
  "permissions": {
    "allow": [
      "Bash(npm run test)",
      "Bash(npm run lint)",
      "Bash(git status)",
      "Bash(git diff)"
    ]
  }
}

Each allowed pattern is a trust decision. Over weeks of use, your allowlist grows as you gain confidence in the AI’s behavior. Session 1 prompts for everything. Session 50 auto-allows most operations.
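
Conceptually, each allow rule is a `Tool(argument)` string checked against the action the AI wants to take. A simplified sketch of that check, handling only exact matches (Claude Code's real matcher also supports prefix patterns such as `Bash(npm run test:*)`):

```python
import json

# The settings file from above, parsed as JSON.
settings = json.loads("""
{
  "permissions": {
    "allow": [
      "Bash(npm run test)",
      "Bash(npm run lint)",
      "Bash(git status)",
      "Bash(git diff)"
    ]
  }
}
""")

def is_allowed(tool: str, argument: str, allow: list[str]) -> bool:
    """Exact-match check against allow rules of the form Tool(argument).
    Wildcard/prefix matching is omitted for brevity."""
    return f"{tool}({argument})" in allow

allow = settings["permissions"]["allow"]
# is_allowed("Bash", "git status", allow) -> True  (on the allowlist)
# is_allowed("Bash", "git push", allow)   -> False (falls back to a prompt)
```

Anything that fails the check falls through to the normal permission prompt, which is exactly the "starts restrictive, opens up" gradient.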

Key Insight

The best human-in-the-loop design makes human involvement feel natural, not annoying. The goal is not to minimize human interaction — it is to maximize the value of each interaction.

Bad HITL asks: “May I read this file?” (obviously yes, always).
Good HITL asks: “I found 3 approaches. Which do you prefer?” (a genuine decision point).

The framework: auto-allow no-risk actions, gate at strategic decision points, escalate when uncertain, and never ask questions with obvious answers. This mirrors how you manage a junior developer — you review their plan and check their PR, not watch them type.

Hands-On Example

A deployment workflow with approval gates:

## Deploy Workflow (in CLAUDE.md)

When I ask you to deploy:
1. RESEARCH (autonomous): check git status, run tests, check deps
2. GATE 1: show test results, git status, warnings. Wait for "proceed"
3. BUILD (autonomous): run production build, verify output
4. GATE 2: show build size, warnings. Ask "Ready to push to main?"
5. DEPLOY: only with explicit "yes" -- run git push origin main

The interaction:

User: "Deploy the latest changes"

AI: [runs tests, checks status -- autonomous]
AI: "Pre-deploy: 47/47 tests pass, 3 files changed, no vulns. Build?"

User: "Yes"

AI: [builds -- autonomous]
AI: "Build: 245KB bundle, no warnings. Push to main?"

User: "Push it"

AI: [pushes, reports status]

Two human touchpoints. Each meaningful. Irreversible actions gated. Reversible actions automatic.
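
The transcript above can be simulated as a small script. A sketch under the same assumptions as the CLAUDE.md rules, with the user's replies passed in and each step recorded in a log (the step messages are illustrative):

```python
def deploy_workflow(replies):
    """Walk the gated deploy flow: research -> GATE 1 -> build -> GATE 2 -> push.
    `replies` stands in for the human; irreversible steps need explicit approval."""
    approved = lambda r: r.strip().lower() in {"yes", "y", "push it", "proceed"}
    log = ["research: 47/47 tests pass, 3 files changed"]  # autonomous
    if not approved(replies[0]):                           # GATE 1
        return log + ["stopped before build"]
    log.append("build: production bundle produced")        # autonomous
    if not approved(replies[1]):                           # GATE 2
        return log + ["stopped before push"]
    log.append("push: git push origin main")               # explicit approval only
    return log

steps = deploy_workflow(["yes", "push it"])
# steps[-1] == "push: git push origin main"
```

Declining at either gate halts the flow before the irreversible step, which is the whole point of placing the gates where the table of reversible actions says to.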

What Changed

No HITL Design                         With HITL Design
AI asks about everything or nothing    Strategic checkpoints at decision points
Users feel anxious or annoyed          Users feel informed and in control
Trust is binary (on/off)               Trust builds gradually via allowlists
Failures are silent or catastrophic    Failures escalate at the right level

Next Session

Session 23 dives into Custom Agents & Skills — how to build your own Markdown-based extensions that encode workflow knowledge into reusable, shareable components.