The through-line: composition, not selection
The four extension primitives don’t compete — they nest. A plugin is the packaging unit; it can bundle skills, subagents, slash commands, hooks, and pre-configured MCP servers in one directory installed via /plugin ([1], [2]). A skill with context: fork runs inside a subagent ([3]). A hook can fire on a subagent’s tool call ([4]). The deep-dive should teach the layer cake first, then the comparison table — otherwise the audience will hear four overlapping pitches and pick whichever the speaker demos last.
The cleanest mental hook from the harness brief: subagents give isolation, hooks give determinism (only PreToolUse can block a tool call), and the rest is substrate ([4], [5]). For any given need, ask which it is: a context problem (subagent), a determinism problem (hook), or a defaults problem (everything else)?
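To make "hooks give determinism" concrete, here is a minimal sketch of a PreToolUse hook script. It assumes the hook receives a JSON event on stdin with `tool_name` and `tool_input` fields, and that exit code 2 blocks the call and returns stderr to the model; check both against the current hooks reference before demoing. The deny-list and the function names `decide` and `run` are hypothetical.

```python
"""Sketch of a PreToolUse hook that blocks risky shell commands."""
import json
import sys

# Hypothetical deny-list; a real policy would be project-specific.
BLOCKED_SUBSTRINGS = ("rm -rf", "| sh", "| bash")


def decide(event: dict) -> tuple[int, str]:
    """Return (exit_code, message) for one hook event.

    Assumption: exit code 0 lets the tool call proceed, 2 blocks it.
    """
    if event.get("tool_name") != "Bash":
        return 0, ""
    command = event.get("tool_input", {}).get("command", "")
    for needle in BLOCKED_SUBSTRINGS:
        if needle in command:
            return 2, f"Blocked by policy: command contains {needle!r}"
    return 0, ""


def run(stdin, stderr) -> int:
    """Read one JSON event, write any refusal reason, return the exit code."""
    code, message = decide(json.load(stdin))
    if message:
        print(message, file=stderr)
    return code


# Entry point when installed as a hook command:
#     sys.exit(run(sys.stdin, sys.stderr))
```

The decision is plain code rather than a prompt, which is the point: the same input always produces the same verdict, regardless of what the model was told.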
The Skill-vs-MCP live wire — don’t duck it
Simon Willison’s framing, “I expect we’ll see a Cambrian explosion in Skills which will make this year’s MCP rush look pedestrian by comparison” ([6]), is the one tension expert audiences will already be arguing about. The workable heuristic the child briefs converged on: use MCP when you need a long-lived stateful connection (DB session, OAuth handshake, SaaS API); use a Skill when “run this CLI, read the output” is enough; both can ship inside the same plugin. Skills are also now a cross-vendor open standard, with Cursor, Copilot, Codex, Gemini CLI and 30+ agents reading the same SKILL.md ([7]), so portability has flipped in their favour.
The trust boundary is the session’s reason to exist
This is session 3 in a series whose second instalment was security. Don’t paper over the inheritance — every extension layer has both a high-leverage distribution mechanism and a fresh CVE history:
- MCP: 110M+ SDK downloads/month under Linux Foundation governance ([8], [9]), but mcp-remote CVE-2025-6514 (437k+ downloads, RCE), Postmark BCC supply-chain, Smithery breach (3,000+ servers), and a 2026 scan finding ~200,000 vulnerable instances exposed ([10], [11]).
- Skills: Snyk’s ToxicSkills audit found 36.8% of skills with at least one flaw, 13.4% critical, 91% of malicious skills embed prompt injection to bypass safety, and the dynamic-context !`cmd` feature runs before the model sees the skill, so model-level defences never fire ([12]).
- Plugins: CVE-2025-59536 (RCE via hook config before the trust dialog), the “TrustFall” pattern (cloning a hostile repo executes code), and marketplace dependency-hijack PoCs ([13], [14], [15]).
The unifying frame to land on a slide is Willison’s lethal trifecta — private data + untrusted instructions + an exfiltration vector ([16]). Every layer above makes assembling it accidentally easier.
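One way to show how the trifecta assembles by accident is to model each installed extension as a set of capabilities and check the session as a whole. The sketch below is illustrative only: the `Extension` type, its three flags, and the example extensions are hypothetical and are not part of any real tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extension:
    """Hypothetical capability summary for one installed extension."""
    name: str
    reads_private_data: bool = False
    ingests_untrusted_text: bool = False
    can_send_externally: bool = False


def trifecta_legs(extensions: list[Extension]) -> dict[str, bool]:
    """Report which legs of the trifecta the combined session has."""
    return {
        "private_data": any(e.reads_private_data for e in extensions),
        "untrusted_instructions": any(e.ingests_untrusted_text for e in extensions),
        "exfiltration_vector": any(e.can_send_externally for e in extensions),
    }


def is_lethal(extensions: list[Extension]) -> bool:
    """True when all three legs are present somewhere in the session."""
    return all(trifecta_legs(extensions).values())


# No single extension holds all three legs, but the session does.
session = [
    Extension("crm-mcp-server", reads_private_data=True),
    Extension("web-research-skill", ingests_untrusted_text=True),
    Extension("webhook-plugin", can_send_externally=True),
]
print(is_lethal(session))       # True
print(is_lethal(session[:2]))   # False
```

The check is per session, not per extension, which is why auditing each component in isolation can miss the risk.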
Shape of the 90 minutes
The delivery-plan child argues for 75 min content + 15 min buffer, 8–12 min active blocks, monologue capped at ~7 min, mid-session recap at 0:45, and a cold-open callback to sessions 1 and 2 ([17], [18]). Three live demos earn their slot: build a FastMCP tool + Inspector ([19], [20]), mkdir → plugin.json → SKILL.md → --plugin-dir to ship a plugin in one breath ([21]), and a sandboxed tool-poisoning reproduction ([16]). Pre-record each as a fallback.
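The mkdir → plugin.json → SKILL.md sequence can be scripted so the demo has a known-good starting point. This is a sketch under assumptions: it places the manifest at `.claude-plugin/plugin.json` and the skill at `skills/<name>/SKILL.md`, and uses `name`, `description`, and `version` manifest fields. Verify the layout against the plugin documentation ([21]) before relying on it. The plugin and skill names are hypothetical.

```python
import json
from pathlib import Path


def scaffold_plugin(root: Path, name: str = "demo-plugin") -> Path:
    """Create a minimal plugin directory containing one skill."""
    plugin = root / name
    manifest_dir = plugin / ".claude-plugin"   # assumed manifest location
    skill_dir = plugin / "skills" / "hello"    # assumed skills layout
    manifest_dir.mkdir(parents=True, exist_ok=True)
    skill_dir.mkdir(parents=True, exist_ok=True)

    manifest = {
        "name": name,
        "description": "Demo plugin built live in session 3",
        "version": "0.1.0",
    }
    (manifest_dir / "plugin.json").write_text(
        json.dumps(manifest, indent=2) + "\n", encoding="utf-8"
    )

    # SKILL.md: YAML frontmatter followed by the instructions themselves.
    skill = (
        "---\n"
        "name: hello\n"
        "description: Greets the user; use when asked to say hello.\n"
        "---\n"
        "\n"
        "Reply with a one-line greeting.\n"
    )
    (skill_dir / "SKILL.md").write_text(skill, encoding="utf-8")
    return plugin
```

After `scaffold_plugin(Path("."))`, pointing the `--plugin-dir` flag at `./demo-plugin` should load it; keeping the generated directory in the repo doubles as the pre-recorded fallback's source.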
The decision-framework child suggests locking weights before scoring candidates and breaking ties on runnable-demo viability ([22]) — applied here, the trust-boundary block keeps its slide count because it’s the only block that compounds on the prior session’s audience.
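As a small illustration of "lock weights first, break ties on demo viability", the sketch below fixes the weights in a read-only mapping and sorts candidate blocks by weighted total, then by whether a runnable demo exists. The criteria, weights, block names, and scores are all hypothetical placeholders, not figures from the decision-framework brief.

```python
from types import MappingProxyType

# Hypothetical weights, locked (read-only) before any candidate is scored.
WEIGHTS = MappingProxyType({"audience_value": 5, "novelty": 3, "prep_ease": 2})


def total(scores: dict[str, int]) -> int:
    """Weighted sum of a candidate's 1-5 scores."""
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)


def rank(candidates: list[dict]) -> list[str]:
    """Order by weighted total, breaking ties in favour of runnable demos."""
    ordered = sorted(
        candidates,
        key=lambda c: (total(c["scores"]), c["demo_runnable"]),
        reverse=True,
    )
    return [c["name"] for c in ordered]


blocks = [
    {"name": "comparison table", "demo_runnable": False,
     "scores": {"audience_value": 4, "novelty": 2, "prep_ease": 3}},
    {"name": "trust boundary", "demo_runnable": True,
     "scores": {"audience_value": 3, "novelty": 5, "prep_ease": 1}},
]
print(rank(blocks))  # ['trust boundary', 'comparison table']
```

Both blocks total 32 here, so the runnable demo decides the order; integer weights keep the tie exact rather than subject to floating-point rounding.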
Open question to leave on stage
With plugins as the packaging layer, MCP under Linux Foundation governance, and SKILL.md now a cross-vendor format, the harness boundary has moved from “what Claude Code knows” to “what the marketplace ships.” The registry-as-trust-root is the only realistic answer to supply-chain attacks ([10]), but it isn’t solved yet — so the honest question to hand the audience is whether session 4 should be agent-to-agent (A2A) coordination ([23]), or a hard look at whether any of this is auditable enough to put on the critical path.