Skills & Knowledge Loading

Understand Claude Code's two-layer knowledge injection — how skill names are loaded upfront while full definitions are deferred until needed.

March 20, 2026 · 16 min read

What You’ll Learn

Claude Code has a rich set of skills — from code review to test running to debugging. But loading all skill definitions into the system prompt at once would waste precious context tokens. The solution: a two-layer injection system.

By the end, you’ll understand:

• How skills and agents extend Claude Code’s capabilities
• The two-layer loading pattern (names upfront, bodies on demand)
• How CLAUDE.md files provide project-specific knowledge
• The knowledge hierarchy and resolution order

The Problem

Claude Code can have dozens of skills (specialized prompts for specific tasks) and agents (specialized subagent configurations). Each skill’s full definition might be 500-2000 tokens, so even 20 skills at the low end add up to 10,000 tokens. Loading all of them into the system prompt would spend that budget (roughly 10% of a 100,000-token context window, or 5% of a 200,000-token one) before the conversation even starts.

But the AI needs to know what skills are available so it can use them when appropriate.
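The arithmetic behind this trade-off can be sketched in a few lines. This is a rough estimate, not a measurement: the skill count of 20 and the 25 tokens per name-plus-description entry are assumptions chosen for illustration, and the function names are hypothetical.

```python
# Back-of-the-envelope token budget. Every number here is an assumption
# based on the ranges quoted in the text, not a measured value.

def eager_cost(skill_count: int, tokens_per_body: int) -> int:
    """Tokens spent if every full skill body sits in the system prompt."""
    return skill_count * tokens_per_body


def index_cost(skill_count: int, tokens_per_entry: int = 25) -> int:
    """Tokens spent if only a name and a one-line description are listed."""
    return skill_count * tokens_per_entry


if __name__ == "__main__":
    skills = 20  # assumed count, the low end of "dozens"
    print(eager_cost(skills, 500))   # low end of the 500-2000 range -> 10000
    print(eager_cost(skills, 2000))  # high end of the range -> 40000
    print(index_cost(skills))        # names and descriptions only -> 500
```

Under these assumptions, listing only names and descriptions costs about one twentieth of the cheapest eager-loading case.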

How It Works

The Two-Layer System

Layer 1: System Prompt (always loaded)
┌────────────────────────────────────────┐
│ Available skills:                       │
│ • commit — Git commit with quality      │
│ • review-pr — Review a pull request     │
│ • workflow — 5-step development flow    │
│ • test-runner — Run and analyze tests   │
│ • ...                                   │
│ (names + 1-line descriptions only)      │
└────────────────────────────────────────┘

Layer 2: On-Demand (loaded when invoked)
┌────────────────────────────────────────┐
│ [User types /commit]                    │
│                                         │
│ Full skill body injected:               │
│ • Step-by-step instructions             │
│ • Formatting rules                      │
│ • Examples                              │
│ • Edge case handling                    │
│ (500-2000 tokens)                       │
└────────────────────────────────────────┘
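The two layers in the diagram can be modeled as a small registry that keeps summaries in memory and fetches bodies lazily. This is a minimal sketch, not Claude Code's actual implementation: `SkillRegistry`, its method names, and the caching of loaded bodies are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class SkillRegistry:
    """Hypothetical registry: summaries are eager, bodies are lazy."""
    summaries: Dict[str, str] = field(default_factory=dict)
    loaders: Dict[str, Callable[[], str]] = field(default_factory=dict)
    loaded: Dict[str, str] = field(default_factory=dict)

    def register(self, name: str, description: str,
                 loader: Callable[[], str]) -> None:
        # Only the cheap summary is stored now; the loader runs later.
        self.summaries[name] = description
        self.loaders[name] = loader

    def system_prompt_section(self) -> str:
        """Layer 1: names and one-line descriptions only."""
        lines = ["Available skills:"]
        lines += [f"- {name}: {desc}" for name, desc in self.summaries.items()]
        return "\n".join(lines)

    def invoke(self, name: str) -> str:
        """Layer 2: fetch the full body the first time it is needed."""
        if name not in self.loaded:
            self.loaded[name] = self.loaders[name]()
        return self.loaded[name]


registry = SkillRegistry()
registry.register("commit", "Git commit with quality",
                  lambda: "1. Run git status\n2. Analyze changes")
registry.register("review-pr", "Review a pull request",
                  lambda: "Review the diff for ...")

print(registry.system_prompt_section())  # short listing, no bodies
print(registry.invoke("commit"))         # body loaded only at this point
```

The loader callback stands in for whatever reads the skill body from disk. Until `invoke` runs, none of that text is loaded.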

Skill Definition Format

Skills are defined as Markdown files with frontmatter:

---
name: smart-commit
description: Conventional Commits with quality checks
user_invocable: true
---

# Instructions

When the user asks to commit...
1. Run git status
2. Analyze changes
3. Draft commit message following Conventional Commits
4. Run pre-commit checks
...

The frontmatter fields name and description are what Layer 1 lists. The body (everything after the closing ---) is loaded only when the skill is invoked, so its length costs nothing until then.
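Splitting a skill file into its frontmatter and body might look like the sketch below. It handles only flat `key: value` lines, which is all the example above uses; a real loader would more likely use a YAML library. The function name `split_skill` is hypothetical.

```python
from typing import Dict, Tuple


def split_skill(text: str) -> Tuple[Dict[str, str], str]:
    """Split a skill file into (frontmatter fields, body).

    Simplified: supports flat `key: value` pairs only, not full YAML.
    """
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("skill file must start with a '---' line")
    try:
        # Index of the closing '---' that ends the frontmatter.
        end = next(i for i, line in enumerate(lines[1:], start=1)
                   if line.strip() == "---")
    except StopIteration:
        raise ValueError("frontmatter is missing its closing '---'") from None

    meta: Dict[str, str] = {}
    for line in lines[1:end]:
        if ":" in line:
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip()

    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


SKILL = """---
name: smart-commit
description: Conventional Commits with quality checks
user_invocable: true
---

# Instructions

When the user asks to commit...
"""

meta, body = split_skill(SKILL)
print(meta["name"], "|", meta["description"])  # what Layer 1 would list
print(body.splitlines()[0])                    # first line of the Layer 2 body
```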

Knowledge Hierarchy

Claude Code assembles its knowledge from multiple sources, in priority order:

Priority (highest → lowest):
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

1. User message (current prompt)
2. Conversation history
3. Project CLAUDE.md (in the repo)
4. Project rules (.claude/rules/*.md)
5. User CLAUDE.md (~/.claude/CLAUDE.md)
6. System prompt (built-in)
7. Skill/Agent definitions (on demand)

This means a project’s CLAUDE.md can override default behaviors, and user-level configuration provides personal preferences across all projects.
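One way to picture the resolution order is as a first-match lookup over sources sorted by priority. This is a deliberate simplification: in practice these sources are text in the context window rather than key-value settings, and the keys and values below are invented for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Sources ordered highest to lowest priority, mirroring the list above.
# The settings are made-up examples, not real Claude Code options.
Source = Tuple[str, Dict[str, str]]

SOURCES: List[Source] = [
    ("user message", {}),
    ("conversation history", {}),
    ("project CLAUDE.md", {"package_manager": "pnpm"}),
    ("project rules", {"test_command": "pnpm test"}),
    ("user CLAUDE.md", {"package_manager": "npm",
                        "commit_style": "conventional"}),
    ("system prompt", {"commit_style": "plain"}),
    ("skill/agent definitions", {}),
]


def resolve(key: str,
            sources: List[Source] = SOURCES) -> Optional[Tuple[str, str]]:
    """Return (value, source name) from the highest-priority source
    that defines `key`, or None if no source defines it."""
    for name, settings in sources:
        if key in settings:
            return settings[key], name
    return None


print(resolve("package_manager"))  # project CLAUDE.md beats user CLAUDE.md
print(resolve("commit_style"))     # user CLAUDE.md beats the system prompt
print(resolve("unknown"))          # nothing defines it
```

The first call shows the override described above: the project file says pnpm, the user file says npm, and the project file wins because it sits higher in the list.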

CLAUDE.md as Project Knowledge

The CLAUDE.md file is the most important knowledge injection point for project-specific context:

# Project: My E-commerce App

## Tech Stack
- Next.js 14 with App Router
- PostgreSQL with Prisma ORM
- Stripe for payments

## Development Commands
- `pnpm dev` — start dev server
- `pnpm test` — run tests with Vitest
- `pnpm db:migrate` — apply database migrations

## Conventions
- Use server components by default
- API routes in app/api/
- All prices stored in cents (integer)

This tells Claude Code exactly how to work with your project — what commands to run, what conventions to follow, where things are.

Agent Definitions

Agents work similarly to skills but define specialized subagent behaviors:

---
name: code-reviewer
description: Expert code reviewer for quality, security, and best practices
tools: [Read, Grep, Glob, Bash]
---

Review the code for:
1. Security vulnerabilities (OWASP Top 10)
2. Performance issues
3. Code quality and maintainability
...

When the parent agent spawns a code-reviewer subagent, these instructions become the subagent’s system prompt.
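The hand-off from definition to subagent can be sketched as a plain data transformation. The types and the `spawn` function are hypothetical, and treating `tools` as an allow-list for the subagent is an assumption about what that field means, not something the definition format itself confirms.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AgentDefinition:
    """Parsed agent file: frontmatter fields plus the Markdown body."""
    name: str
    description: str
    tools: List[str]
    body: str


@dataclass(frozen=True)
class SubagentRequest:
    """What a parent might pass along when delegating (hypothetical shape)."""
    system_prompt: str
    allowed_tools: List[str]
    task: str


def spawn(defn: AgentDefinition, task: str) -> SubagentRequest:
    # The definition body becomes the subagent's system prompt;
    # the tools list is copied so the request cannot alias the definition.
    return SubagentRequest(system_prompt=defn.body,
                           allowed_tools=list(defn.tools),
                           task=task)


reviewer = AgentDefinition(
    name="code-reviewer",
    description="Expert code reviewer for quality, security, and best practices",
    tools=["Read", "Grep", "Glob", "Bash"],
    body="Review the code for:\n1. Security vulnerabilities (OWASP Top 10)",
)

request = spawn(reviewer, task="Review the changes in src/checkout/")
print(request.system_prompt.splitlines()[0])
print(request.allowed_tools)
```

The parent only needs the name and description to decide whether to delegate. The body matters only once the subagent is created, which is the same deferred-loading idea applied to agents.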

Key Insight

The two-layer system is a token budget optimization. It’s the same principle behind lazy loading in web development — don’t load what you don’t need yet.

This design enables Claude Code to have a rich ecosystem of skills and agents (50+) without paying the context cost upfront. The AI knows what’s available (Layer 1) and can access full details when needed (Layer 2).

For your own projects, the key takeaway is: put project knowledge in CLAUDE.md. It’s the most efficient way to give Claude Code the context it needs. A well-crafted CLAUDE.md is worth more than dozens of back-and-forth messages explaining your codebase.

Hands-On Example

Create a minimal skill definition:

# File: .claude/skills/deploy/SKILL.md
---
name: deploy
description: Deploy to production with safety checks
user_invocable: true
---

## Deploy Workflow

1. Run the test suite: `pnpm test`
2. If tests pass, build: `pnpm build`
3. Check for uncommitted changes
4. If clean, deploy: `pnpm deploy`
5. Verify deployment health check

## Safety Rules
- NEVER deploy if tests fail
- ALWAYS check for uncommitted changes first
- Ask for confirmation before deploying to production

This skill can be invoked with /deploy and provides structured guidance for the deployment process.

CLAUDE.md Design Tips

# DO: Be specific and actionable
"Use pnpm, not npm. Run tests with pnpm test."

# DON'T: Be vague
"Follow best practices for the project."

# DO: Include error handling guidance
"If pnpm install fails, delete node_modules and rerun pnpm install. Do not delete pnpm-lock.yaml without asking."

# DON'T: Include obvious things
"JavaScript files end in .js" (Claude already knows this)

What Changed

Loading Everything                 | Two-Layer Loading
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
10,000+ tokens in system prompt    | ~500 tokens for names/descriptions
Slow initial load                  | Fast startup
Wastes context on unused skills    | Full body only when invoked
Hard to scale skill count          | Scales to 50+ skills easily
Project context via conversation   | Project context via CLAUDE.md

Next Session

Session 6 covers Context Compaction — what happens when the context window fills up, and how Claude Code’s 3-layer compression strategy keeps conversations going without losing critical information.