"You may not modify the test file."
Without this constraint, expert developers find the cheat in under 10 minutes and dismiss the entire loop. It redirects AI optimization from "make it green by any means" to "write a correct implementation." Adding standard TDD procedural instructions without a dependency context map raised regression rates from 6.08% to 9.94%, which is worse than giving no TDD instruction at all. [6]
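One way to back the instruction with a mechanical check is to fingerprint the test files before the AI runs and compare afterwards. The following is a minimal sketch, not a prescribed part of the workshop; the function names `snapshot` and `modified_files` are hypothetical, and it assumes the protected tests are ordinary files on disk.

```python
import hashlib
from pathlib import Path


def digest(path):
    """SHA-256 of the file contents, or None if the file does not exist."""
    p = Path(path)
    return hashlib.sha256(p.read_bytes()).hexdigest() if p.is_file() else None


def snapshot(paths):
    """Map each protected test file path to its current digest."""
    return {str(p): digest(p) for p in paths}


def modified_files(before, after):
    """Sorted paths whose contents changed (or vanished) between snapshots."""
    return sorted(p for p, d in before.items() if after.get(p) != d)
```

Taking `snapshot(...)` before the Green phase and again before the human review gives the reviewer a direct answer: any non-empty result from `modified_files` means the agent touched a file it was told to leave alone.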
Vibe-code a feature without tests → add a second feature → watch the architecture degrade live. AI agents never spontaneously suggest refactoring without test constraints. [7] Show the vibe-coded diff alongside a TDD diff. Expert devs internalize it without argument.
.devcontainer/devcontainer.json with prebuild enabled

curl test snippet

curl snippet returns valid response

Extends classic red-green-refactor with a Plan phase before Red (AI generates implementation roadmap) and a Validate phase after Refactor (human reviews the diff to catch "cheat" tests).
Prepends Example Mapping before the first test. Three autonomy levels — let participants choose and debrief the difference:
| Level | Autonomy | Note |
|---|---|---|
| A | AI runs until end of feature | speed mode |
| B | AI runs until end of each RGR cycle | ★ default |
| C | AI runs until end of each phase | max oversight |
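The difference between the three levels is easiest to see as the set of points where the AI stops and hands control back. The sketch below is illustrative only: the function name `review_points` and the `(cycle, phase)` representation are assumptions, and it models just the red-green-refactor (RGR) phases named in the table.

```python
RGR_PHASES = ("red", "green", "refactor")


def review_points(level, cycles):
    """Return the (cycle, phase) positions after which a human reviews.

    level  -- "A", "B" or "C", as in the autonomy table
    cycles -- number of RGR cycles needed to finish the feature
    """
    steps = [(c, phase) for c in range(1, cycles + 1) for phase in RGR_PHASES]
    if level == "A":
        # One stop, at the end of the whole feature.
        return steps[-1:]
    if level == "B":
        # One stop at the end of every RGR cycle.
        return [s for s in steps if s[1] == "refactor"]
    if level == "C":
        # A stop after every single phase.
        return steps
    raise ValueError(f"unknown autonomy level: {level!r}")
```

For a feature that takes two cycles, level A yields one review point, level B two, and level C six, which gives participants a concrete number to compare in the debrief.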
| Failure mode | Prevention |
|---|---|
| ⚙ Environment setup in live session: loses 20–30 min; derails all subsequent timings | Codespaces prebuild + mandatory pre-check 24 hrs before [9] |
| 🔑 AI API key failure on day: blocks all exercises; kills workshop credibility | Pre-provision with expiry; day-before curl test required to claim key [10] |
| 📺 Demo-heavy, hands-on-light: expert disengagement within 15 min | Hard rule: ≤7 min explanation before participants touch code; 60–70% of session must be hands-on [14] |
| 🗣 Dominant expert hijacking discussion: others disengage; session follows one rabbit hole | Parking lot + timebox; round-robin debrief format; silent brainstorm before open floor [13] |
| ❓ Exercise too ambiguous: participants stuck; helpers overwhelmed; pacing collapses | Test every exercise solo end-to-end before the session; embed "if stuck" hints as code comments in repo stub |
| 🛠 Tool sprawl: cognitive overload; participants lose their place | One primary tool per task; introduce tools sequentially; avoid simultaneous Zoom + Miro + Slack + IDE [14] |
| 👤 No helper in breakout rooms: stuck participants wait silently; frustration builds | 1 helper per room of 4–6, briefed on exercise goals, arrives in room for first 3 min [11] |
| 🚧 Expert resistance to AI tooling: overt scepticism infects room culture | Address AI limits explicitly in context frame; peer-champion framing; concrete first win in < 5 min [12] |
| ⏰ Overrun debrief, no synthesis time: participants leave with open loops | Hard 10-min closing slot in run-of-show; parking lot absorbs overflow; written recap within 24 hrs |
| ☕ No break in 90-min session: focus degrades in last 30 min; diminishing returns | 5-min break at 00:55, non-negotiable even under time pressure |
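Several of the prevention rules in the table are numeric, so a draft run-of-show can be checked against them before the session. The sketch below is one possible reading of those rules, not an official tool: the segment kinds (`"explain"`, `"hands_on"`, `"break"`, `"closing"`) and the function name are assumptions, it treats the 7-minute limit as applying to each explanation block, and it checks only the 60% lower bound of the hands-on share.

```python
def check_run_of_show(segments):
    """Return rule violations for a 90-min agenda given as (kind, minutes) pairs."""
    issues = []
    total = sum(minutes for _, minutes in segments)
    hands_on = sum(minutes for kind, minutes in segments if kind == "hands_on")

    # Rule: at most 7 min of explanation before participants touch code.
    for kind, minutes in segments:
        if kind == "explain" and minutes > 7:
            issues.append(f"explanation block of {minutes} min exceeds 7 min")

    # Rule: at least 60% of the session is hands-on.
    if total and hands_on / total < 0.60:
        issues.append(f"hands-on share is {hands_on / total:.0%}, below 60%")

    # Rule: the session ends with a 10-min closing slot.
    if not segments or tuple(segments[-1]) != ("closing", 10):
        issues.append("session does not end with a 10-min closing slot")

    # Rule: a 5-min break starts at minute 55.
    start, break_found = 0, False
    for kind, minutes in segments:
        if kind == "break" and start == 55 and minutes == 5:
            break_found = True
        start += minutes
    if not break_found:
        issues.append("no 5-min break at 00:55")

    return issues
```

An empty list means the agenda passes all four checks; each string in a non-empty result names the rule that was broken, which makes the output usable as a pre-session checklist.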