Every code change passes through an 11-gate checklist before it ships. This essay covers what each gate catches, which incident put it on the list, and why the discipline is dumb on purpose. The alternative is not "be more careful"; the alternative is "ship 12 things, 4 of which are broken."
Read this if you are bringing an LLM-based agent into a codebase and want to know why the structural-gate approach beats the heroic-vigilance approach. Each gate is a one-line mechanical check. None requires judgment. All prevent specific failures the system has seen.
Every code change in a Daily AI Agents session passes through an 11-gate checklist before it ships. The checklist is at docs/build-prompt-checklist.md in the public repo and reads:
1. Read these before any code (architecture audit, session-boot doc, recent decisions).
2. Grep-verify any referenced Hermes config key, frontmatter field, or CLI flag before writing it.
3. Doctor green before code (the scripts/doctor.sh health check passes).
4. Per-commit Rollback: line.
5. owner_agent frontmatter on every skill.
6. Source patches require explicit Cooper approval.
7. Final report ≤ 30 lines, format-fixed.
8. Skills not Python scripts (the 3-question gate: Is the new code an agent capability? Is it called by agents? Is it doable in bash + curl + jq? If all three answers are yes, it's a skill, not a script).
9. Acceptance via actual smoke, not declaration.
10. Half-shipped clean > rushed broken.
11. Never sell the OUTPUT of an unproven internal capability.
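Gate 8's three-question test is mechanical enough to write down as a predicate. The sketch below is illustrative only: `ProposedCode` and `should_be_skill` are hypothetical names that do not come from the repo, and Python is used here purely to show the logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedCode:
    """Answers to the three gate-8 questions for a piece of new code (hypothetical type)."""
    is_agent_capability: bool
    called_by_agents: bool
    doable_in_bash_curl_jq: bool


def should_be_skill(proposal: ProposedCode) -> bool:
    """Gate 8: if all three answers are yes, the code is a skill, not a script."""
    return (
        proposal.is_agent_capability
        and proposal.called_by_agents
        and proposal.doable_in_bash_curl_jq
    )


# Hypothetical cases: one clears all three questions, one fails the third.
fetch_quote = ProposedCode(True, True, True)
heavy_backtest = ProposedCode(True, True, False)
print(should_be_skill(fetch_quote), should_be_skill(heavy_backtest))
```

The point of the predicate form is that there is nothing to weigh: three yes/no answers go in, one yes/no answer comes out.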
The checklist exists because each gate names a class of failure that wasted at least one session in the v17 → vFINAL session arc. Gate 2 caught nine fictional primitives: config keys the agent assumed existed when they did not. Gate 8 caught dozens of cases where the agent was about to write a Python script for something that should have been a skill. Gate 11 caught the trading-as-a-service product page that promised outcomes the system couldn't back.
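Gate 2 amounts to a search: every name the agent is about to rely on must appear somewhere in the real sources before it is written. Here is a minimal sketch, assuming the relevant config and docs can be read as plain text. `load_sources`, `find_unverified`, and the example key `auto_retry_budget` are hypothetical, and a real session would more likely run grep directly.

```python
from pathlib import Path
from typing import Iterable, Mapping


def load_sources(root: str) -> dict[str, str]:
    """Read every file under root into memory as text, keyed by path."""
    sources = {}
    for path in Path(root).rglob("*"):
        if path.is_file():
            sources[str(path)] = path.read_text(encoding="utf-8", errors="ignore")
    return sources


def find_unverified(names: Iterable[str], sources: Mapping[str, str]) -> list[str]:
    """Return the names that appear in none of the source texts."""
    return [n for n in names if not any(n in text for text in sources.values())]


# Hypothetical example: one name that exists in the sources, one the agent invented.
sources = {"skill.md": "---\nowner_agent: example-agent\n---\n"}
missing = find_unverified(["owner_agent", "auto_retry_budget"], sources)
print(missing)  # ['auto_retry_budget']
```

Anything in the returned list is a candidate fictional primitive and should not be written until it is found. The substring match is deliberately loose; a stricter version would match whole words only.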
The discipline is dumb on purpose. Every gate is a one-line check the agent can do mechanically before writing a line of code. None of them require judgment; all of them prevent specific failures the system has seen.
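To make "one-line check" concrete, here is how gates 4, 5, and 7 might look as code. This is a sketch under stated assumptions, not the project's tooling: it assumes the rollback note is a commit-message line starting with `Rollback:`, that skill frontmatter is a block delimited by `---` lines, and that the report limit counts raw lines. All function names and sample values are hypothetical.

```python
def has_rollback_line(commit_message: str) -> bool:
    """Gate 4: the commit message carries a line starting with 'Rollback:' (assumed format)."""
    return any(line.strip().startswith("Rollback:") for line in commit_message.splitlines())


def has_owner_agent(skill_text: str) -> bool:
    """Gate 5: the skill's frontmatter declares owner_agent (assumes '---' delimiters)."""
    lines = skill_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return False  # no frontmatter block at all
    for line in lines[1:]:
        if line.strip() == "---":
            return False  # frontmatter closed without the field
        if line.startswith("owner_agent:"):
            return True
    return False


def report_within_limit(report: str, max_lines: int = 30) -> bool:
    """Gate 7: the final report is at most max_lines lines long."""
    return len(report.splitlines()) <= max_lines


# Hypothetical inputs for illustration.
commit = "Tighten doctor check\n\nRollback: git revert HEAD"
skill = "---\nname: example-skill\nowner_agent: example-agent\n---\nBody text"
report = "\n".join(f"line {i}" for i in range(30))
print(has_rollback_line(commit), has_owner_agent(skill), report_within_limit(report))
```

Each function returns a plain yes or no, which is what lets an agent run the gate before writing code rather than reasoning about it afterwards.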
The alternative is not "be more careful." The alternative is "ship 12 things, 4 of which are broken, and let them surface as bugs to the founder over the next two weeks." The checklist is what prevents that outcome.
A related discipline: every session starts with a tight prompt that names the ships, the priority order, and the hard stops. Cooper writes the prompts; the agent executes them. The prompt is the contract; the checklist is the structural guarantee that the contract gets honored.
The combination is reproducible. A new contributor — human or agent — can read the prompt, read the checklist, execute the work, and produce shippable output without prior context on the project. That property is what makes the system durable beyond Cooper's involvement.
**Chapter 7 summary:** 11-gate checklist on every session before code ships. Each gate prevents a specific failure mode the system has seen. Discipline is structural, not heroic. The combination of tight prompts plus mechanical checklist makes the work reproducible.