FLASHPOINT 01
At what corpus size does in-process vector search (FAISS, LanceDB) break down?
Correct frame: query concurrency, not corpus size alone.
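A 10M-vector in-process FAISS index can buckle under 50 concurrent users, because every query competes for the same process's CPU and memory.
A managed Qdrant or Weaviate cluster can absorb the same load at 100M vectors by spreading queries across replicas and shards.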
Give the room a concurrency framing, not a raw corpus-size number.
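One way to make the concurrency framing concrete is a back-of-envelope saturation check. The sketch below is an illustration, not a benchmark: the function names, the worker count, the per-query latency, and the per-user query rate are all assumptions chosen for the example. It compares the query rate the users generate against the ceiling a fixed pool of in-process workers can serve.

```python
import math


def offered_qps(concurrent_users: int, queries_per_user_per_s: float) -> float:
    """Query rate generated by the user population."""
    return concurrent_users * queries_per_user_per_s


def max_sustainable_qps(workers: int, latency_s: float) -> float:
    """Throughput ceiling when each worker serves one query at a time."""
    if workers <= 0 or latency_s <= 0:
        raise ValueError("workers and latency_s must be positive")
    return workers / latency_s


# Illustrative inputs only (assumptions, not measurements):
# 50 users, each sending one query every 2 seconds,
# served by 4 in-process workers at 200 ms per query.
demand = offered_qps(concurrent_users=50, queries_per_user_per_s=0.5)
ceiling = max_sustainable_qps(workers=4, latency_s=0.2)
load = demand / ceiling

print(f"demand={demand:.1f} qps, ceiling={ceiling:.1f} qps, load={load:.2f}")
if load >= 1.0:
    print("Saturated: queries queue up and latency grows without bound.")
```

Corpus size enters only through latency_s. The same index is comfortable or saturated depending on how many users hit it at once.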
FLASHPOINT 02
Is contextual retrieval worth doubling the embedding index cost?
The fork: daily-ingesting corpora (the extra cost recurs with every ingest) vs index-once corpora (the extra cost is one-time).
Surface this as a room vote: "raise your hand if your corpus ingests more than 10,000 documents per day."
The vote makes the trade-off personal and avoids a settled-answer posture.
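The fork can be put into numbers with a small cost model. This is a sketch under stated assumptions: the per-document embedding cost and the initial corpus size are hypothetical placeholders, the 2x multiplier stands for "doubling", and 10,000 documents per day is the vote threshold. Plug in your own figures.

```python
import math


def extra_index_cost(docs: int, cost_per_doc: float, multiplier: float = 2.0) -> float:
    """Added cost of contextual retrieval over plain embedding for `docs` documents."""
    return docs * cost_per_doc * (multiplier - 1.0)


def cumulative_extra_cost(
    initial_docs: int,
    daily_docs: int,
    days: int,
    cost_per_doc: float,
    multiplier: float = 2.0,
) -> float:
    """Added cost after `days` of ingestion on top of the initial index build."""
    total_docs = initial_docs + daily_docs * days
    return extra_index_cost(total_docs, cost_per_doc, multiplier)


# Hypothetical inputs: 1M-document corpus, 0.001 currency units per document.
COST_PER_DOC = 0.001

index_once = cumulative_extra_cost(1_000_000, 0, 365, COST_PER_DOC)
daily_ingest = cumulative_extra_cost(1_000_000, 10_000, 365, COST_PER_DOC)

print(f"index-once, extra cost after one year:   {index_once:,.0f}")
print(f"daily-ingest, extra cost after one year: {daily_ingest:,.0f}")
```

With these placeholder inputs the index-once corpus pays the premium a single time, while the daily-ingesting corpus keeps paying it on every new document.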
FLASHPOINT 03
Should evaluation be the last segment, or the very first?
The answer: first. Deploy a minimal RAGAS harness in Segment 1 alongside the naive pipeline.
Re-run at every checkpoint.
The hallucination-rate drop after adding the guardrail is visually undeniable
when it is plotted across all 5 stages; a one-shot before/after score does not make the same case.
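The re-run-at-every-checkpoint habit can be sketched as a small log that stores one score per stage and renders them side by side. This is a minimal illustration, not the RAGAS API: is_supported is a hypothetical stand-in for whatever faithfulness judgment the harness provides, the substring check is a deliberately crude toy judge, and the samples are made-up toy data.

```python
from typing import Callable, List, Tuple

# (question, retrieved context, generated answer)
Sample = Tuple[str, str, str]
Judge = Callable[[Sample], bool]


def hallucination_rate(samples: List[Sample], is_supported: Judge) -> float:
    """Fraction of answers the judge marks as NOT supported by the context."""
    if not samples:
        raise ValueError("need at least one sample")
    unsupported = sum(1 for s in samples if not is_supported(s))
    return unsupported / len(samples)


class CheckpointLog:
    """Keeps one hallucination-rate score per pipeline stage, in order."""

    def __init__(self) -> None:
        self.history: List[Tuple[str, float]] = []

    def record(self, stage: str, samples: List[Sample], is_supported: Judge) -> float:
        rate = hallucination_rate(samples, is_supported)
        self.history.append((stage, rate))
        return rate

    def render(self, width: int = 40) -> str:
        """Text bar chart: one row per recorded stage."""
        rows = [
            f"{stage:<12} {'#' * round(rate * width)} {rate:.0%}"
            for stage, rate in self.history
        ]
        return "\n".join(rows)


def toy_judge(sample: Sample) -> bool:
    """Toy stand-in for a real judge: answer must appear verbatim in the context."""
    _, context, answer = sample
    return answer.lower() in context.lower()


# Toy data for illustration only, not workshop results.
context = "Paris is the capital of France."
before = [("capital?", context, "Paris"), ("capital?", context, "Lyon")]
after = [("capital?", context, "Paris"), ("capital?", context, "paris")]

log = CheckpointLog()
log.record("naive", before, toy_judge)
log.record("guardrail", after, toy_judge)
print(log.render())
```

Calling record once per checkpoint with the same evaluation set is what turns a single before/after pair into a curve across all stages.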