RUN-OF-SHOW · BINDER COPY · DRAFT v3 SESSION 03 OF 04 · DEEP-DIVE SERIES RECORDED 2026-05-23

Extending
Claude Code —
MCP · Skills · Plugins

A 90-minute virtual deep-dive on the four extension primitives, the marketplace ecosystem around them, and the trust boundary the audience is already arguing about.

Live Demos
× 3 Pre-record each as fallback

Runtime

90min live + 15 buffer

Active blocks

8–12min each, monologue ≤ 7

Citations

85across 6 expedition children

Depth

expedition3 survey · 3 recon children

§1

The through-line

COLD-OPEN · 00:00 – 00:15

Verdict The four extension primitives don't compete — they nest. Teach the layer cake first, the comparison table second. Otherwise the audience hears four overlapping pitches and picks whichever you demo last. ^[1]^[2]

L4 · packaging

Plugin One directory installed via /plugin. Bundles skills, subagents, slash commands, hooks, and pre-configured MCP servers. defaults problem

L3 · capability

Skill + MCP A Skill loaded on demand from SKILL.md — or a long-lived MCP server. Same plugin can ship both. SKILL.md is now cross-vendor. capability problem

L2 · isolation

Subagent A Skill with context: fork runs inside a subagent.^[3] Fresh context, returns a summary. context problem

L1 · determinism

Hook Fires on tool calls — including a subagent's. Only PreToolUse can block.^[4] determinism problem

L0 · substrate

Config Permissions, settings, allowlists.^[5] Everything above sits on this. defaults problem

§2

Run-of-show · 90 min

75 MIN CONTENT + 15 BUFFER

Time code	Block	Cue / why
00:00 – 00:05	Cold open · callback to S1 + S2opener animation reuse	60-sec visual recap of S1 (AI) and S2 (Security). Series replays compound across episodes — re-anchor returning viewers in the first beat.^[17]
00:05 – 00:15	Frame the problem + live pollpassive-to-active pivot	poll "Which extension primitive have you shipped to prod?" Average viewer checks email after 10 min of passive content — interrupt before they do.^[18]
00:15 – 00:30	Concept block 1 · the layer cakeplugin → skill+mcp → subagent → hook	Walk §1 stack top-down. Land the question "context problem, determinism problem, or defaults problem?" Monologue cap ≤ 7 min then chat prompt.
00:30 – 00:45	Demo A · FastMCP + Inspectorlive build, ~12 min	demolive Build a FastMCP tool, run Inspector against it. FastMCP ⭐ 14k. Show-don't-tell beats slide-heavy stretches.
00:45 – 00:50	Mid-session recapattention dip · halfway mark	3-bullet recap of the layer cake. Tease the trust trifecta. Attention drops at 0:45 — recap pulls it back.^[17]
00:50 – 01:00	Concept block 2 · trust boundaryS2 callback compound	The session's reason to exist. Three primitives, three CVE timelines — see §3. This is the block that compounds on S2's audience habit.
01:00 – 01:12	Demo B · ship a plugin in one breathmkdir → plugin.json → SKILL.md → --plugin-dir	demolive Four commands, one breath. Then demo C: sandboxed tool-poisoning repro. Cued recording ready.^[21]
01:12 – 01:22	Synthesis + Q&Aco-host runs chat	Q&A lifts retention ~32% vs. no-Q&A.^[17] Co-host filters and surfaces. Speaker stays in flow.
01:22 – 01:30	Wrap · S4 teaser · continuity bridgecapture cohort while attention is hot	Hand the open question to the audience (§7). Tease S4 — A2A coordination, or an auditability hard look. Continuity offer at end converts 25–40% of completers.^[17]

§3

Trust dossier · the reason this session exists

BLOCK 50:00 – 60:00

Do not paper over Session 2 was security. Every extension layer has both a high-leverage distribution mechanism and a fresh CVE history. Each row below is a slide.

AUTHZED · 2026

MCP

Surface: 110M+ SDK downloads / month · Linux Foundation governance^[8]^[9]

CVE-2025-6514 · mcp-remote RCE, 437k+ downloads^[10]
Postmark · supply-chain BCC exfil^[10]
Smithery · breach affecting 3,000+ servers^[10]
~200k vulnerable instances exposed (2026 scan)^[11]

SNYK · TOXICSKILLS

Skills

Surface: cross-vendor SKILL.md · 30+ agents read the same file^[7]

36.8% of skills have at least one flaw^[12]
13.4% critical · 91% of malicious skills embed prompt injection^[12]
!`cmd` dynamic-context runs before the model sees the skill — model-level defences never fire^[12]

CHECK POINT · 2026

Plugins

Surface: marketplace install + hook config trust dialog^[1]

CVE-2025-59536 · RCE via hook config before the trust dialog fires^[13]
TrustFall · cloning a hostile repo executes code^[14]
Marketplace dependency-hijack PoCs against the install flow^[15]

§4

The unifying frame

SLIDE · ONE

Simon Willison · the lethal trifecta

Three ingredients. Any agent that has all three is a credential.

Ingredient 1

Private data the agent can read

Ingredient 2

Untrusted instructions in the context

Ingredient 3

An exfiltration vector

= data theft, made accidentally easier by every extension layer above ^[16]

§5

The live wire · Skills vs MCP

EXPECT THIS QUESTION

I expect we'll see a Cambrian explosion in Skills which will make this year's MCP rush look pedestrian by comparison. — Simon Willison, Oct 2025

Workable heuristic → ask which problem you actually have. Both can ship inside the same plugin.

Use…	When…	Because
MCP	Long-lived stateful connection — DB session, OAuth handshake, SaaS API	You need a server that holds state across calls^[19]
Skill	"Run this CLI, read the output" is enough	Markdown + bundled scripts; loads on demand^[3]
Both	You want one install surface	Plugins bundle skills and pre-configured MCP servers^[1]

§6

Live demos · shot list

3 EARNED SLOTS

DEMO A · 12 min00:30 – 00:42

FastMCP + Inspector

Build an MCP tool from scratch in Python, point the Inspector at it, watch the JSON-RPC traffic.

pip install fastmcp
python server.py
npx @modelcontextprotocol/inspector

Tools · FastMCP ⭐ 14k · Inspector ⭐ 5.4k

DEMO B · 7 min01:00 – 01:07

Plugin in one breath

Four commands. Ship a working plugin live without leaving the terminal — to land the "packaging unit" argument viscerally.

mkdir my-plugin && cd my-plugin
echo '{...}' > plugin.json
echo '---' > SKILL.md
claude --plugin-dir .

Reference · code.claude.com/plugins

DEMO C · 5 min01:07 – 01:12

Tool-poisoning repro
SANDBOXED · NO LIVE TARGETS

Reproduce Willison's prompt-injection-via-MCP-tool-description trick in a sealed VM. The audience needs to see the trifecta, not read it.

# sandbox VM only
# no network egress
# pre-recorded fallback cued

Reference · simonwillison.net

§7

Briefing dossiers · supporting research

PREP READING · BACKSTAGE

DOC · 01

MCP deep-dive

Talk-prep brief for a 1–2 hour deep-dive on the Model Context Protocol — architecture, 2026 spec, ecosystem, security trifecta, live-demo recipes.

survey30 citations · 10 min

DOC · 02

Claude Code Skills

A Skill is a Markdown file + optional bundled scripts that Claude loads on demand. Cheaper than CLAUDE.md, more discoverable than a slash command, lighter than a subagent — now an open cross-vendor standard.

survey13 citations · 5 min

DOC · 03

Plugins & the Marketplace

What plugins actually are, how the official + community marketplaces work, the few that earn their keep, and the trust footgun to flag in any 1–2 hour deep-dive.

survey21 citations · 10 min

DOC · 04

Subagents, hooks, and the rest of the harness

One-page mental model: subagents (isolation), hooks (determinism), and the config substrate (skills, permissions, settings).

recon6 citations · 2 min

DOC · 05

Comparison & decision framework

Score candidate topics on four axes — audience fit, series continuity, speaker readiness, demo viability — with weights chosen before scoring. Skip RICE; it's product-feature shaped.

recon7 citations · 2 min

DOC · 06

Session delivery plan

A 90-minute run-of-show with 8–12 min active blocks, mid-session recap, and series-continuity callbacks tuned for the third session in a deep-dive series.

recon8 citations · 2 min

§8 · Hand to the audience

The harness boundary moved from "what Claude Code knows" to "what the marketplace ships."

Registry-as-trust-root is the only realistic answer to supply-chain attacks ^[10] — and it isn't solved yet. So the honest question to leave on stage: should session 4 be agent-to-agent (A2A) coordination, or a hard look at whether any of this is auditable enough to put on the critical path?