Human-in-the-Loop

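How to design confirmation flows, approval gates, and escalation patterns.

March 20, 2026 · 17 min read

What You'll Learn

An AI agent that acts without oversight is dangerous. An AI agent that asks permission at every step is tiresome. The best Human-in-the-Loop design sits between the two, bringing the human in at the right moment with the right information.

By the end of this session, you will understand:

The autonomy spectrum and when each mode applies
How to design confirmation flows that feel natural
Approval gates: checkpoints where the AI waits for human judgment
Escalation patterns for when the AI gets stuck
The trust gradient: how to start restricted and open up gradually

The Problem

Consider two extremes: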

Extreme 1: Fully Manual
  User: "Fix the bug"
  AI: "I'd like to read auth.ts. May I?"
  User: "Yes"
  AI: "I see the bug. May I edit line 42?"
  User: "Yes"
  AI: "May I run the tests?"
  User: "Yes"
  → Correct but exhausting. 10 minutes for a 30-second fix.

Extreme 2: Fully Autonomous
  User: "Fix the bug"
  AI: *reads 50 files, edits 12, runs deploy, pushes to prod*
  User: "Wait, I didn't mean--"
  → Fast but terrifying. One wrong assumption and you're debugging the debugger.

The real answer lies somewhere in between. But where? It depends on the action, the size of the risk, and the level of trust.

How It Works

The Autonomy Spectrum

┌──────────────────────────────────────────────────────┐
│                Autonomy Spectrum                      │
│                                                       │
│  Manual    Guided     Semi-Auto    Fully Auto         │
│    │         │           │            │               │
│    ▼         ▼           ▼            ▼               │
│  Every     Plan        Auto for     Everything       │
│  action    first,      reads +      runs without     │
│  needs     confirm     searches,    any prompts      │
│  approval  execution   prompt for                    │
│                        writes                        │
│                                                       │
│  claude    claude      claude       claude            │
│  --plan    (default)   (default)    --dangerously-    │
│                                     skip-permissions  │
│                                                       │
│  ◄── More oversight          Less oversight ──►       │
│  ◄── More friction           Less friction  ──►       │
│  ◄── Lower risk              Higher risk    ──►       │
└──────────────────────────────────────────────────────┘

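When the Human Should Be Involved

Not every action carries the same risk. The key is to match oversight to impact:

Action type	Reversible	Risk level	Approach
Read files	Yes	None	Auto-allow
Search codebase	Yes	None	Auto-allow
Edit local files	Yes (git)	Low	Prompt or allow
Run tests	Yes	Low	Allow via allowlist
Run arbitrary commands	Depends	Medium	Ask the user
Delete files	Partially	Medium	Always confirm
Git push	No	High	Always confirm
Production deploy	No	High	Always confirm + final confirmation
Database changes	No	Critical	Multi-step confirmation

Principle: reversibility determines autonomy. If an action can be undone with git checkout, it can be auto-allowed. If it changes production state, a human must approve it.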
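
The reversibility principle can be sketched as a small policy function. This is a minimal illustration, not how Claude Code decides anything internally: the `Action` fields, the `Approach` values, and the three-way split are hypothetical simplifications of the risk table.

```python
from dataclasses import dataclass
from enum import Enum


class Approach(Enum):
    AUTO_ALLOW = "auto-allow"
    ALWAYS_CONFIRM = "always confirm"
    MULTI_STEP = "multi-step confirmation"


@dataclass(frozen=True)
class Action:
    """Hypothetical description of something the agent wants to do."""
    name: str
    reversible: bool            # can it be undone locally, e.g. with git?
    touches_production: bool    # does it change shared or production state?
    destroys_data: bool = False # could it lose data that cannot be restored?


def required_oversight(action: Action) -> Approach:
    """Map an action to an oversight level; reversibility decides autonomy."""
    if action.destroys_data:
        return Approach.MULTI_STEP
    if action.touches_production or not action.reversible:
        return Approach.ALWAYS_CONFIRM
    return Approach.AUTO_ALLOW
```

A real policy would likely need more levels (for example, the allowlist and "ask the user" rows), but the shape stays the same: the less reversible the action, the more human involvement it requires.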

Confirmation Flows

Claude Code's permission system is a confirmation flow. When the AI requests a tool that requires approval, the user sees a prompt:

┌──────────────────────────────────────────────┐
│                                               │
│  Claude wants to run:                         │
│                                               │
│    bash: npm run build                        │
│                                               │
│  [Allow]  [Deny]  [Allow for this session]    │
│                                               │
└──────────────────────────────────────────────┘

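A good confirmation prompt has three properties:

Clear action: it shows exactly what will happen
Sufficient context: it says why the AI wants to do this
Quick decision: the user can say yes or no without deep analysis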
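
As a rough sketch, the three properties can be enforced in the code that renders a prompt. The `ConfirmationRequest` type and the `Why:` line are assumptions made for illustration; they are not part of Claude Code's actual prompt format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfirmationRequest:
    """Hypothetical request shown to the user before a tool runs."""
    tool: str     # e.g. "bash"
    command: str  # the exact action that will run
    reason: str   # why the AI wants to do this


def render_prompt(req: ConfirmationRequest) -> str:
    """Render a prompt with the action, the context, and quick choices."""
    if not req.reason.strip():
        # Without a reason the user cannot decide quickly, so refuse to render.
        raise ValueError("a confirmation prompt needs a reason for context")
    return "\n".join([
        "Claude wants to run:",
        f"  {req.tool}: {req.command}",      # clear action
        f"Why: {req.reason}",                # sufficient context
        "[Allow]  [Deny]  [Allow for this session]",  # quick decision
    ])
```

Rejecting a request with an empty reason is one possible design choice: it makes missing context a bug in the agent rather than a burden on the user.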

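Approval Gates

An approval gate is a designed checkpoint where the AI pauses and presents its plan before continuing. Unlike permission prompts, which fire per tool call, approval gates are strategic pauses within a workflow.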

┌──────────────────────────────────────────────┐
│              Approval Gate Pattern             │
│                                               │
│  Phase 1: Research (autonomous)               │
│  ├── Read files                               │
│  ├── Search codebase                          │
│  ├── Analyze dependencies                     │
│  │                                            │
│  ▼                                            │
│  ╔══════════════════════════════════════════╗  │
│  ║  GATE: Present findings + proposed plan  ║  │
│  ║  "I found 3 issues. Here's my plan..."   ║  │
│  ║  User reviews and approves/modifies      ║  │
│  ╚══════════════════════════════════════════╝  │
│  │                                            │
│  ▼                                            │
│  Phase 2: Implementation (semi-autonomous)    │
│  ├── Edit files (per CLAUDE.md rules)         │
│  ├── Run tests                                │
│  │                                            │
│  ▼                                            │
│  ╔══════════════════════════════════════════╗  │
│  ║  GATE: Show results before deployment    ║  │
│  ║  "All tests pass. Ready to push?"        ║  │
│  ╚══════════════════════════════════════════╝  │
│  │                                            │
│  ▼                                            │
│  Phase 3: Deploy (human-controlled)           │
│  └── Push only with explicit approval         │
│                                               │
└──────────────────────────────────────────────┘

You can encode approval gates in CLAUDE.md:

## Workflow Rules

- Always present a plan before making changes to more than 3 files
- Never push to remote without explicit user confirmation
- After running tests, report results and wait for approval before proceeding
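
Outside of CLAUDE.md, the same gate pattern could be expressed in an orchestration script. The sketch below assumes a hypothetical `run_with_gates` helper in which each phase returns a short summary and an `approve` callback stands in for the human; none of these names come from Claude Code.

```python
from typing import Callable, List, Tuple

# A phase is a name plus a function that does the work and returns a summary.
Phase = Tuple[str, Callable[[], str]]


def run_with_gates(phases: List[Phase], approve: Callable[[str], bool]) -> List[str]:
    """Run phases in order, pausing at a gate between consecutive phases.

    `approve` receives the gate message and returns True to continue.
    Returns the names of the phases that actually ran.
    """
    completed: List[str] = []
    for index, (name, work) in enumerate(phases):
        summary = work()
        completed.append(name)
        is_last = index == len(phases) - 1
        if not is_last and not approve(f"{name} finished: {summary}. Continue?"):
            break  # the human declined, so later phases never start
    return completed
```

In interactive use, `approve` might wrap `input()` and compare the answer to "yes"; in tests it can be a plain function, which keeps the gate logic easy to verify.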

Escalation Patterns

When the AI runs into ambiguity or failure, it should escalate rather than guess:

┌──────────────────────────────────────────────┐
│            Escalation Ladder                  │
│                                               │
│  Level 1: Self-resolve                        │
│  AI encounters error → reads error message    │
│  → tries alternative approach → succeeds      │
│                                               │
│  Level 2: Inform and continue                 │
│  AI hits a non-blocking issue → reports it    │
│  → continues with main task                   │
│                                               │
│  Level 3: Ask for clarification               │
│  AI faces ambiguous requirement               │
│  → asks specific question → waits             │
│                                               │
│  Level 4: Stop and report                     │
│  AI cannot proceed safely                     │
│  → explains situation → presents options      │
│  → waits for human decision                   │
│                                               │
└──────────────────────────────────────────────┘

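The AI picks a level based on two factors: confidence and impact.

	Impact: low	Impact: high
Confidence: high	Level 1: Self-resolve	Level 2: Inform and continue
Confidence: low	Level 3: Ask for clarification	Level 4: Stop and report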
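
The confidence and impact matrix translates directly into a lookup. In this sketch the numeric scores in the range 0 to 1 and the 0.5 threshold are assumptions for illustration; the source only distinguishes "high" from "low".

```python
from enum import IntEnum


class Escalation(IntEnum):
    SELF_RESOLVE = 1
    INFORM_AND_CONTINUE = 2
    ASK_FOR_CLARIFICATION = 3
    STOP_AND_REPORT = 4


def choose_escalation(confidence: float, impact: float,
                      threshold: float = 0.5) -> Escalation:
    """Pick an escalation level from the confidence/impact matrix.

    Scores at or above `threshold` count as "high" (a hypothetical cutoff).
    """
    high_confidence = confidence >= threshold
    high_impact = impact >= threshold
    if high_confidence:
        return (Escalation.INFORM_AND_CONTINUE if high_impact
                else Escalation.SELF_RESOLVE)
    return (Escalation.STOP_AND_REPORT if high_impact
            else Escalation.ASK_FOR_CLARIFICATION)
```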

The Trust Gradient

Trust is built over time. A well-designed system starts restrictive and opens up gradually:

Session 1:  ████░░░░░░  Default mode, prompt for everything
Session 5:  ██████░░░░  Allowlisted common commands
Session 20: ████████░░  Most operations auto-allowed
Session 50: ██████████  Team has full confidence, minimal prompts

Claude Code supports this through the permission allowlist (.claude/settings.json):

{
  "permissions": {
    "allow": [
      "Bash(npm run test)",
      "Bash(npm run lint)",
      "Bash(git status)",
      "Bash(git diff)"
    ]
  }
}

Each allowed pattern is a trust decision: "I have seen this action often enough to trust it."
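
One way to picture the gradient is a ledger that counts manual approvals and suggests an allowlist entry once a command has been approved often enough. This is a hypothetical sketch: the `TrustLedger` class, the `suggest_after` threshold, and the exact-string matching are assumptions, and Claude Code's real rule matching may behave differently.

```python
from collections import Counter
from typing import List


class TrustLedger:
    """Track manual approvals and grow an allowlist one decision at a time."""

    def __init__(self, allow: List[str], suggest_after: int = 3) -> None:
        self.allow = set(allow)             # rules such as "Bash(npm run test)"
        self.suggest_after = suggest_after  # hypothetical approval threshold
        self._approvals: Counter = Counter()

    @staticmethod
    def rule_for(command: str) -> str:
        """Build a rule string in the same shape as the settings file entries."""
        return f"Bash({command})"

    def is_allowed(self, command: str) -> bool:
        # Simplification: exact match only, no wildcard or prefix handling.
        return self.rule_for(command) in self.allow

    def record_approval(self, command: str) -> bool:
        """Count an approval; True means it is worth suggesting for the allowlist."""
        self._approvals[command] += 1
        return (not self.is_allowed(command)
                and self._approvals[command] >= self.suggest_after)

    def promote(self, command: str) -> None:
        """Add the rule once a human has agreed to trust this command."""
        self.allow.add(self.rule_for(command))
```

Keeping `promote` separate from `record_approval` means the system only suggests; the decision to extend trust stays with a person.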

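Key Takeaways

The best Human-in-the-Loop design makes human involvement feel natural rather than annoying. The goal is not to minimize human interaction but to maximize the value of each interaction.

Bad HITL question: "May I read this file?" (obviously yes, always)
Good HITL question: "I found three approaches. Which one do you prefer?" (a real decision point)

The framework:

Auto-allow actions that carry no risk and have clear intent
Place gates at strategic decision points where human judgment adds value
Escalate when the AI is uncertain, not when it is confident
Never ask a question that has an obvious answer

This is the same way you manage a junior developer. You don't watch them type every character. You review the plan, check the PR, and answer questions when they get stuck.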

Hands-On

Here is a deployment workflow with approval gates, encoded in CLAUDE.md:

## Deployment Workflow

When I ask you to deploy:

1. RESEARCH PHASE (autonomous)
   - Check git status (any uncommitted changes?)
   - Run the full test suite
   - Check for dependency vulnerabilities

2. APPROVAL GATE 1: Pre-deploy report
   - Show: test results, git status, any warnings
   - Wait for my "proceed" before continuing

3. BUILD PHASE (autonomous)
   - Run production build
   - Verify build output

4. APPROVAL GATE 2: Final confirmation
   - Show: build size, any build warnings
   - Explicitly ask: "Ready to push to main?"
   - Only proceed with explicit "yes"

5. DEPLOY (with confirmation)
   - git push origin main
   - Report deployment status

When the AI follows this workflow, the interaction looks like this:

User: "Deploy the latest changes"

AI: [reads files, runs tests, checks status -- all autonomous]

AI: "Pre-deploy report:
     - Tests: 47/47 passing
     - Git: 3 files changed, no uncommitted work
     - Dependencies: no vulnerabilities
     Ready to build?"

User: "Yes, go ahead"

AI: [runs build -- autonomous]

AI: "Build complete:
     - Bundle size: 245KB (down from 260KB)
     - No warnings
     Ready to push to main?"

User: "Push it"

AI: [pushes to main]
AI: "Deployed. Cloudflare build started."

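Two human touchpoints, each one meaningful. The user keeps control of irreversible actions without micromanaging the reversible ones.

What Changed

Without HITL design	With HITL design
AI asks about everything or nothing	Strategic checkpoints at decision points
Users feel anxious or frustrated	Users feel informed and in control
Trust is binary (on/off)	Trust builds gradually through allowlists
Failures are silent or catastrophic	Failures escalate at the appropriate level
Every session feels the same	Autonomy grows as confidence grows

Next Session

You now understand how to design the boundary between human and AI. Session 23 goes further into custom agents and skills. You will learn how to build your own Markdown-based extensions that encode workflow knowledge, specialized behaviors, and team standards into reusable, shareable components.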