TL;DR: Skills are user manuals for your agent's tools. Get them wrong and your agent spends more time confused than working.
What is a Skill?
A skill is a user manual for a tool — or a chain of tools.
If the model isn’t powerful enough to figure out tool usage on its own, a skill also includes examples. Think of it like onboarding documentation: “here’s what this tool does, here’s when to use it, here’s a concrete example.”
Unlike a one-time prompt, skills are designed to be read repeatedly. Your agent will reach for them on every relevant task.
The Pile of Manuals Problem
Now imagine your agent has 50 user manuals in front of it. It needs to pick the right one before it can do anything.
Two problems emerge immediately:
1. Ambiguity kills accuracy. If two skills are too similar — say, two different ways to fetch weather data — the model has no reliable way to pick. It’ll guess, and it’ll guess wrong sometimes.
2. Context burns tokens. Loading every skill into the context window is wasteful and degrades focus. The more irrelevant content the model has to wade through, the noisier its reasoning becomes.
Modern agent design spends a lot of effort solving the skill selection problem before skill loading ever happens.
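One cheap way to catch the ambiguity problem before it reaches the model is to compare skill descriptions against each other and flag pairs that look too similar. The sketch below is only an illustration: it assumes each skill has a one-line description, uses plain word-overlap (Jaccard) similarity, and the skill names and the 0.5 threshold are made up for the example.

```python
import re
from itertools import combinations


def tokens(text: str) -> set[str]:
    """Lowercased word set of a skill description."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Shared words divided by total distinct words (0.0 if both sets are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_overlaps(descriptions: dict[str, str], threshold: float = 0.5):
    """Return (skill_a, skill_b, score) for every pair at or above the threshold."""
    token_sets = {name: tokens(desc) for name, desc in descriptions.items()}
    pairs = []
    for left, right in combinations(sorted(token_sets), 2):
        score = jaccard(token_sets[left], token_sets[right])
        if score >= threshold:
            pairs.append((left, right, round(score, 2)))
    return pairs


# Hypothetical skills: two ways to fetch weather, one unrelated skill.
skills = {
    "weather_api": "fetch current weather data for a city",
    "weather_scrape": "fetch weather data for a city from a web page",
    "stock_trade": "place a buy or sell order for a stock ticker",
}
overlaps = find_overlaps(skills)
print(overlaps)
```

Here the two weather skills get flagged as a pair to merge or differentiate, while the stock skill does not. Word overlap is a crude signal; embedding similarity would be a natural upgrade, but the check itself stays the same.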
Skill Selection: Index Before Load
The right pattern is: select from an index first, then load only the chosen skill.
Think about driving a car. You don’t need the manual for how to fix the engine just because you’re making a left turn. Likewise, if your agent is writing a document, it doesn’t need the stock trading skill loaded into its context.
The goal is:
- Fast — retrieval should not be the bottleneck
- Accurate — wrong skill = wrong tool = failed task
In my implementation, I skip the naive “dump all skills into context” approach and use indexed selection instead: match the task to the right skill first, and only then inject that skill into the context.
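To make the index-before-load idea concrete, here is a minimal sketch. It is not the code from my repo: the `IndexEntry` type, the keyword-overlap scoring, the file paths, and the skill names are all assumptions chosen to keep the example short. The point it shows is that selection only ever reads the short summaries, and the full skill text is loaded for the winner alone.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class IndexEntry:
    name: str     # hypothetical skill name
    summary: str  # short description; the only text scanned during selection
    path: str     # where the full skill text lives


def words(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_skill(task: str, index: list[IndexEntry],
                 min_score: int = 1) -> Optional[IndexEntry]:
    """Pick the entry whose summary shares the most words with the task."""
    task_words = words(task)
    best, best_score = None, 0
    for entry in index:
        score = len(task_words & words(entry.summary))
        if score > best_score:
            best, best_score = entry, score
    return best if best_score >= min_score else None


def build_context(task: str, index: list[IndexEntry],
                  load: Callable[[str], str]) -> str:
    """Load only the selected skill; load nothing if no entry matches."""
    entry = select_skill(task, index)
    if entry is None:
        return f"Task: {task}"
    return f"{load(entry.path)}\n\nTask: {task}"


# Illustrative index and an in-memory stand-in for skill files on disk.
INDEX = [
    IndexEntry("write_document", "draft edit format document report memo",
               "skills/write_document.txt"),
    IndexEntry("stock_trading", "buy sell stock orders ticker portfolio",
               "skills/stock_trading.txt"),
]
SKILL_FILES = {
    "skills/write_document.txt": "SKILL: write_document\nUse the editor tool to draft text.",
    "skills/stock_trading.txt": "SKILL: stock_trading\nUse the broker tool to place orders.",
}

context = build_context("Draft a short report on Q3 sales", INDEX,
                        SKILL_FILES.__getitem__)
print(context)
```

The document task pulls in the writing skill and nothing else. In a real agent you would likely swap the word-overlap score for embeddings or a small classifier, and `load` for an actual file read.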
Skill Selection as Reinforcement
Here’s an interesting insight: learning skill selection from human behavior is, in spirit, what Meta’s “distill from human” approach does at scale.
When a human expert picks the right tool for a job, that decision carries signal. If you capture those decisions — which skill was chosen, what was the context, did it succeed — you can train a model to make better choices over time.
The data you accumulate from real agent runs becomes a natural fine-tuning dataset. Retrain on it periodically, and your agent gets better at picking the right skill the more it works.
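Capturing that signal can be as simple as appending one record per selection to a log. The sketch below assumes a JSON Lines format and a three-field record (task, chosen skill, success flag); the field names and the `input`/`label` shape of the training examples are placeholders I picked for illustration, not a required schema.

```python
import io
import json
from dataclasses import asdict, dataclass
from typing import Iterable, TextIO


@dataclass
class SelectionRecord:
    task: str          # the context the choice was made in
    chosen_skill: str  # which skill was picked
    succeeded: bool    # did the task succeed with that skill?


def log_selection(record: SelectionRecord, sink: TextIO) -> None:
    """Append one selection decision as a JSON line."""
    sink.write(json.dumps(asdict(record)) + "\n")


def to_training_examples(lines: Iterable[str]) -> list[dict]:
    """Turn successful selections into (input, label) pairs."""
    examples = []
    for line in lines:
        row = json.loads(line)
        if row["succeeded"]:
            examples.append({"input": row["task"], "label": row["chosen_skill"]})
    return examples


# An in-memory buffer stands in for a log file in this example.
log = io.StringIO()
log_selection(SelectionRecord("Draft a report on Q3 sales", "write_document", True), log)
log_selection(SelectionRecord("Summarize this memo", "stock_trading", False), log)

examples = to_training_examples(log.getvalue().splitlines())
print(examples)
```

This version keeps only successful picks as positive examples. The failed records are still in the log, so they remain available as negative examples if your training setup can use them.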
What’s in a Skill File?
In practice, a skill is a plain text file. It can include:
- Tool definition — what the tool does, its parameters, return values
- Usage instructions — when to call it, what to avoid
- Chaining examples — how to combine it with other tools
- Failure modes — common errors and how to recover
Images work too, as long as the model that reads the skill handles multimodal input.
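As a rough picture of how those four parts could fit together, here is a sketch that holds a skill in a small dataclass and renders it to plain text. The section headings, the `get_weather` tool, and its parameters are all invented for the example; any consistent layout the model can read would do.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    tool_definition: str  # what the tool does, parameters, return values
    usage: str            # when to call it, what to avoid
    chaining_examples: list[str] = field(default_factory=list)
    failure_modes: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the skill as a plain text file body."""
        lines = [
            f"SKILL: {self.name}", "",
            "TOOL DEFINITION", self.tool_definition, "",
            "USAGE", self.usage,
        ]
        if self.chaining_examples:
            lines += ["", "CHAINING EXAMPLES"]
            lines += [f"- {example}" for example in self.chaining_examples]
        if self.failure_modes:
            lines += ["", "FAILURE MODES"]
            lines += [f"- {mode}" for mode in self.failure_modes]
        return "\n".join(lines) + "\n"


# Hypothetical tool, used only to show the layout.
weather = Skill(
    name="get_weather",
    tool_definition="get_weather(city: str) returns a dict with temp_c and conditions.",
    usage="Call when the user asks about current weather. Do not use for forecasts.",
    chaining_examples=["get_weather, then write_document to draft a travel note"],
    failure_modes=["Unknown city: ask the user to clarify, then retry once"],
)
text = weather.render()
print(text)
```

Keeping the structure in code and the output in plain text makes it easier to lint skills (for example, rejecting one with no failure modes) while the model still reads an ordinary text file.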
Key Design Principles
- One skill, one job. Overlapping skills cause ambiguity. Deduplicate aggressively.
- Index before load. Never inject skills you don’t need for the current task.
- Skills are maintained, not set-and-forget. APIs change, tools break, better patterns emerge. Treat your skills like code.
- Capture selection signal. Every time your agent picks (or fails to pick) the right skill, that’s training data.
GitHub
The implementation is at github.com/Czhang0727 — skills, selection logic, and the full agent scaffold.
Part 4 covers memory — how agents extend context beyond what fits in the window.
Auth_Verified: 2026.05.10
