Human-in-the-Loop
Design confirmation flows, approval gates, and escalation patterns.
What You’ll Learn
AI agents that act without oversight are dangerous. AI agents that ask permission at every step are annoying. The best human-in-the-loop designs sit in the middle — involving humans at the right moments with the right information.
By the end, you’ll understand:
- The spectrum of autonomy and where each mode fits
- Confirmation flows and approval gates
- Escalation patterns for when the AI gets stuck
- The trust gradient: starting restrictive, opening up over time
The Problem
Consider two extremes:
Fully Manual:
AI: "May I read auth.ts?"
User: "Yes"
AI: "May I edit line 42?"
User: "Yes"
→ Correct but exhausting

Fully Autonomous:
AI: *reads 50 files, edits 12, runs deploy, pushes to prod*
User: "Wait, I didn't mean--"
→ Fast but terrifying
The real answer is in between. But where? It depends on the action, the stakes, and the trust level.
How It Works
The Autonomy Spectrum
┌──────────────────────────────────────────────────────┐
│ Manual Guided Semi-Auto Fully Auto │
│ │ │ │ │ │
│ Every Plan first, Auto for Everything│
│ action confirm reads/search, runs free │
│ prompted execution prompt writes │
│ │
│ claude claude claude claude │
│ --plan (default) (default) --danger.. │
│ │
│ ◄── More oversight Less oversight ──► │
└──────────────────────────────────────────────────────┘
The principle: reversibility determines autonomy. If you can undo it with git checkout, auto-allow it. If it changes production state, a human must approve.
| Action Type | Reversible? | Approach |
|---|---|---|
| Read files, search codebase | Yes | Auto-allow |
| Edit local files | Yes (git) | Prompt or allow |
| Run tests | Yes | Allow with allowlist |
| Run arbitrary commands | Maybe | Prompt user |
| Git push, deploy | No | Always prompt |
| Database mutations | No | Multi-step confirmation |
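The reversibility rule in the table above can be sketched as a simple policy lookup. This is an illustrative sketch, not Claude Code's actual implementation; the action names and the default-to-restrictive fallback are assumptions for the example.

```python
from enum import Enum

class Approval(Enum):
    AUTO_ALLOW = "auto-allow"          # run without asking
    PROMPT = "prompt"                  # ask the user before running
    MULTI_STEP = "multi-step confirm"  # require repeated, explicit confirmation

# Hypothetical mapping from action type to approval level,
# following the reversibility rule from the table above.
POLICY = {
    "read_file":    Approval.AUTO_ALLOW,   # reversible: no state change
    "search_code":  Approval.AUTO_ALLOW,
    "edit_file":    Approval.PROMPT,       # reversible via git, but worth a look
    "run_tests":    Approval.AUTO_ALLOW,   # allow via allowlist
    "run_command":  Approval.PROMPT,       # maybe reversible: ask
    "git_push":     Approval.PROMPT,       # irreversible: always prompt
    "db_mutation":  Approval.MULTI_STEP,   # irreversible: confirm twice
}

def approval_for(action: str) -> Approval:
    """Default to the most restrictive level for unknown actions."""
    return POLICY.get(action, Approval.MULTI_STEP)
```

The key design choice is the fallback: anything the policy does not recognize gets the strictest treatment, so new action types fail safe.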
Approval Gates
An approval gate is a strategic checkpoint where the AI pauses and presents its plan before continuing. Unlike per-tool permission prompts, these are designed workflow pauses.
┌──────────────────────────────────────────────────────┐
│ Phase 1: Research (autonomous) │
│ ├── Read files, search codebase, analyze deps │
│ ▼ │
│ ╔════════════════════════════════════════════════╗ │
│ ║ GATE: Present findings + proposed plan ║ │
│ ║ User reviews and approves/modifies ║ │
│ ╚════════════════════════════════════════════════╝ │
│ ▼ │
│ Phase 2: Implementation (semi-autonomous) │
│ ├── Edit files, run tests │
│ ▼ │
│ ╔════════════════════════════════════════════════╗ │
│ ║ GATE: Show results before deployment ║ │
│ ║ "All tests pass. Ready to push?" ║ │
│ ╚════════════════════════════════════════════════╝ │
│ ▼ │
│ Phase 3: Deploy (human-controlled) │
│ └── Push only with explicit approval │
└──────────────────────────────────────────────────────┘
Encode approval gates in your CLAUDE.md:
```markdown
## Workflow Rules
- Always present a plan before changing more than 3 files
- Never push to remote without explicit user confirmation
- After tests, report results and wait before proceeding
```
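The gate pattern itself is easy to express in code. Below is a minimal sketch of a phased workflow with explicit approval gates; the function names and phase stubs are illustrative, not a Claude Code API.

```python
def gate(message: str, ask=input) -> bool:
    """Pause the workflow and wait for explicit human approval."""
    answer = ask(f"GATE: {message} Proceed? [y/N] ")
    return answer.strip().lower() in ("y", "yes", "proceed")

def deploy_workflow(ask=input) -> str:
    findings = research()                       # Phase 1: autonomous
    if not gate(f"Plan: {findings}.", ask):
        return "stopped at plan review"
    results = implement()                       # Phase 2: semi-autonomous
    if not gate(f"Tests: {results}.", ask):
        return "stopped before deploy"
    return push()                               # Phase 3: human-approved only

# Stubs standing in for the real phases.
def research():  return "3 files to change"
def implement(): return "47/47 passing"
def push():      return "deployed"
```

Note that the default answer is "no": an empty or ambiguous response stops the workflow, which keeps the irreversible phase behind a deliberate human action.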
Escalation Patterns
When the AI encounters ambiguity or failure, it should escalate rather than guess. The decision depends on confidence and consequence:
| | Low Consequence | High Consequence |
|---|---|---|
| **High Confidence** | Self-resolve silently | Inform and continue |
| **Low Confidence** | Ask for clarification | Stop and report |
Level 1: Self-resolve → AI fixes the error on its own
Level 2: Inform + continue → AI notes the issue, keeps going
Level 3: Ask clarification → AI asks a specific question, waits
Level 4: Stop and report → AI cannot proceed safely, presents options
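The matrix maps directly onto a lookup over (confidence, consequence). A sketch with a hypothetical confidence threshold — the 0.8 cutoff is an assumption for illustration, not a prescribed value:

```python
def escalation_level(confidence: float, consequence: str) -> str:
    """Pick an escalation level from confidence (0.0-1.0) and consequence.

    The 0.8 threshold is illustrative; tune it for your workflow.
    """
    high_confidence = confidence >= 0.8
    high_consequence = consequence == "high"
    if high_confidence and not high_consequence:
        return "self-resolve"         # Level 1: fix it silently
    if high_confidence and high_consequence:
        return "inform and continue"  # Level 2: note the issue, keep going
    if not high_confidence and not high_consequence:
        return "ask clarification"    # Level 3: specific question, then wait
    return "stop and report"          # Level 4: present options, do nothing
```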
The Trust Gradient
Trust builds over time. Claude Code supports this through permission allowlists in .claude/settings.json:
```json
{
  "permissions": {
    "allow": [
      "Bash(npm run test)",
      "Bash(npm run lint)",
      "Bash(git status)",
      "Bash(git diff)"
    ]
  }
}
```
Each allowed pattern is a trust decision. Over weeks of use, your allowlist grows as you gain confidence in the AI’s behavior. Session 1 prompts for everything. Session 50 auto-allows most operations.
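Conceptually, each allowlist entry is a pattern matched against the tool call. The following is a rough illustration of how such matching could work using shell-style wildcards — it is not Claude Code's actual matcher, and the wildcard entry is a hypothetical example:

```python
import fnmatch

def is_allowed(tool_call: str, allowlist: list[str]) -> bool:
    """Return True if the tool call matches any allowlist pattern.

    Patterns may use shell-style wildcards, e.g. "Bash(git diff*)".
    """
    return any(fnmatch.fnmatch(tool_call, pattern) for pattern in allowlist)

allow = [
    "Bash(npm run test)",
    "Bash(npm run lint)",
    "Bash(git status)",
    "Bash(git diff*)",   # hypothetical wildcard entry
]

is_allowed("Bash(git status)", allow)       # matches: runs without a prompt
is_allowed("Bash(git push origin)", allow)  # no match: falls back to a prompt
```

Anything that fails to match falls through to the default behavior — prompting the user — so the allowlist only ever loosens restrictions, never tightens them.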
Key Insight
The best human-in-the-loop design makes human involvement feel natural, not annoying. The goal is not to minimize human interaction — it is to maximize the value of each interaction.
Bad HITL asks: “May I read this file?” (obviously yes, always)
Good HITL asks: “I found 3 approaches. Which do you prefer?” (genuine decision point)
The framework: auto-allow no-risk actions, gate at strategic decision points, escalate when uncertain, and never ask questions with obvious answers. This mirrors how you manage a junior developer — you review their plan and check their PR, not watch them type.
Hands-On Example
A deployment workflow with approval gates:
```markdown
## Deploy Workflow (in CLAUDE.md)
When I ask you to deploy:
1. RESEARCH (autonomous): check git status, run tests, check deps
2. GATE 1: show test results, git status, warnings. Wait for "proceed"
3. BUILD (autonomous): run production build, verify output
4. GATE 2: show build size, warnings. Ask "Ready to push to main?"
5. DEPLOY: only with explicit "yes" -- run git push origin main
```
The interaction:
```text
User: "Deploy the latest changes"
AI: [runs tests, checks status -- autonomous]
AI: "Pre-deploy: 47/47 tests pass, 3 files changed, no vulns. Build?"
User: "Yes"
AI: [builds -- autonomous]
AI: "Build: 245KB bundle, no warnings. Push to main?"
User: "Push it"
AI: [pushes, reports status]
```
Two human touchpoints. Each meaningful. Irreversible actions gated. Reversible actions automatic.
What Changed
| No HITL Design | With HITL Design |
|---|---|
| AI asks about everything or nothing | Strategic checkpoints at decision points |
| Users feel anxious or annoyed | Users feel informed and in control |
| Trust is binary (on/off) | Trust builds gradually via allowlists |
| Failures are silent or catastrophic | Failures escalate at the right level |
Next Session
Session 23 dives into Custom Agents & Skills — how to build your own Markdown-based extensions that encode workflow knowledge into reusable, shareable components.