← Default view
Session 4 of 4  ·  Expert Developer Audience  ·  1–2h Virtual

Extending Claude Code — Authoring Craft & Operating at Scale

73 citations 4 research threads 36 min read expedition 2026-06-03
THE description FIELD IS THE ROUTING INTERFACE

All three extension layers share a single activation mechanism: the description field. It is the activation signal in SKILL.md[1], the tool-selection mechanism in MCP server schemas[2], and the dispatch signal for subagent auto-invocation[3]. The session should open here. Make participants write one before the first break.

Skills — activation signal MCP — tool selector Agent SDK — dispatch signal
"Getting a description wrong is not a documentation problem — it is a routing failure that compounds across every session."
Craft rules: what + when, max 1024 chars, always third person, must be testable by the writer.[17]
Layer 1

Skills

Startup cost
~100
tokens at startup (name + desc only)[16]
  • Body loads on demand — only when skill is relevant
  • 30–50 tokens per skill until invoked[4]
  • Teams run 20–50 skills simultaneously at negligible overhead
  • Non-trigger root cause: almost always the description
  • Scope: project, user, or enterprise level
  • disable-model-invocation: true for side-effect workflows
Layer 2

MCP Servers

5-server baseline
50k–66k
tokens before first prompt[6]
  • Playwright 3,442 tok · Gmail 2,640 · Jira ~17,000 · GitHub 8k–12k
  • MCP Tool Search (Jan 2026): up to 95% reduction[5]
  • Auto-enables when tool defs >10% of context window
  • ENABLE_TOOL_SEARCH=auto:5 lowers threshold to 5%
  • 13,200+ tokens recovered in measured sessions
  • Anti-patterns: god-tools, auth-after-build, over-broad scope
Layer 3

Agent SDK

Billing separation
Jun 15
2026 — SDK credit splits from plan quota[13]
  • claude -p --bare for CI — skips all config discovery[12]
  • --bare becomes default for -p in a future release
  • Subagents: isolated fresh contexts, no nesting[3]
  • Dynamic workflows: 16 concurrent, 1000 total/run[24]
  • Hook priority: deny › defer › ask › allow
  • Python ⭐ 7.2k  ·  TypeScript ⭐ 1.5k
Always Loaded
CLAUDE.md
unconditional · keep under 200 lines · everything else should be a Skill
Startup (name + desc)
~100 tok
per skill · 20–50 skills · body deferred[16]
Demand Loaded
skill body
only when Claude decides the skill is relevant[4]
← ALWAYS IN CONTEXT NEVER UNLESS NEEDED →

The pattern is consistent across all three extension layers — treat it as a first-class design principle, not an optimisation tip. Subagent descriptions front-load the routing decision so the spawn prompt never enters context unless the parent decides to dispatch. Corollary: if it can be a Skill, make it one. CLAUDE.md is only for invariants that apply to every interaction.

Server-Side (well-known)
Tools
Executable functions the LLM invokes · action enablers · readOnlyHint enables concurrent dispatch at ~2× rate · destructiveHint triggers confirm dialogs
Resources
Application-controlled context (files, schemas, data) · URI-addressable · list_changed notifications · roots replace allowed-directories[25]
Prompts
Server-curated reusable templates · user-invoked · render as slash commands in IDE clients · the most-skipped primitive after resources
Client-Side builders skip
Sampling Claude Code: ✗
Nested LLM call inside server feature · 200–800ms latency · fits intent classification & validation, not voice (<300ms) · SEP-2577 proposes deprecation[7]
Elicitation Claude Code: ✗
Pauses execution → client renders native form → schema-validated JSON back · three outcomes: accept / decline / cancel · never request secrets[8]
Roots
Client-declared URIs scoping server operations · informational (not strictly enforced) · roots/list_changed allows scope updates without reconnecting
Support Matrix — June 2026
VS Code / GitHub Copilot Claude Code Claude Desktop

Elicitation shipped in the June 2025 spec; neither sampling nor elicitation available in Claude Code as of June 2026. Check clientCapabilities at runtime — this is a live contradiction worth naming in the session.[8]

Live Threat The description field is both the routing interface (§0) and the attack surface. Same field. Use this to close the arc from Session 1.
72%
Peak attack success rate
(MCPTox, 45 live servers)[10]
>60%
Success rate across
popular agents
(353 authentic tools tested)
200k
Vulnerable MCP instances
(2026 disclosure)[11]
Tool Poisoning Attack Chain
malicious description field LLM treats manifest as authoritative executes embedded directives user sees expected output

Rug-pull variant: trusted tool updated post-approval. Manifests are not version-locked at install time.[9] Session closing provocation: when agents are both MCP clients and servers in the same pipeline, do these defences still compose?

Defences
Manifest pinning and signing at install
Allowlists with explicit version locks
Semantic content scanning before consumption
Single-purpose servers (reduced blast radius)
OAuth 2.1 · scope minimisation per tool
Audit log every tool call (CSA guidance)
Enterprise Benchmarks[14]
per dev, active day ~$13
per dev, monthly $150–$250
90th pct daily <$30
agent teams vs standard
June 15, 2026 Billing Separation[13]
15 June 2026
Agent SDK usage moves to a separate monthly credit. Interactive use is unaffected.
Scope: Python SDK · TypeScript SDK · claude -p · GitHub Actions
Pro
$20
/ month
Max 5x
$100
/ month
Max 20x
$200
/ month

The 2026 MCP roadmap removes stateful session IDs to enable stateless horizontal scaling[15], and the Tasks extension (SEP-1686) enables async agent-to-agent communication via MCP. When agents become both MCP clients and MCP servers in the same pipeline, what does the tool-poisoning threat model look like?

The defences developed for human-to-agent flows may not compose cleanly when the "user" approving a manifest is another agent. Leave this open. The session earns the right to not answer it — just to name it.