Essay 4/8 — voice-gate 98/100, scrubber clean — 2026-05-06

Voice-gating and public-content scrubbing

Public output that skips quality gates drifts into corporate hype. This essay covers the two gates every public-bound output passes: voice-gate (advisory, scored against a documented rubric) and the public-content scrubber (a hard block on financial-data leaks).

The gates are simple Python; the discipline is what matters. A founder who relies on "I'll proofread before publish" produces 30% off-voice output by default; a founder whose pipeline runs voice-gate as a required pre-publish step produces 100% on-voice output. Read this for the rubric weights, the surface calibration, and the allowlist patterns that keep sticker prices visible without leaking realized intake.

Every public-bound output passes through 2 quality gates before it ships.

Voice-gate. A score from 0 to 100 against a documented voice specification (VOICE.md). The specification names banned patterns (vague hype words, AI-cliché openers, em-dash overuse, passive-voice ratios above a threshold), a target sentence length per surface (15 words for default, 10 for product pages, 19 for postmortems, 12 for Telegram), and a concreteness requirement (every paragraph must contain a number, a date, a backticked identifier, or a two-word proper noun). The default passing threshold is 70.
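As a rough illustration of how such a rubric can be scored, here is a minimal sketch. The sentence-length targets come from the specification above; the banned-phrase list, the deduction weights, and the function name `voice_score` are assumptions made for this example, not the contents of VOICE.md or the real gate.

```python
import math
import re

# Target average sentence length (in words) per surface, from the spec above.
TARGET_WORDS = {"default": 15, "product": 10, "postmortem": 19, "telegram": 12}

# Hypothetical banned phrases; the real list lives in VOICE.md.
BANNED_PHRASES = ("game-changing", "revolutionary", "in today's fast-paced world")

# A paragraph counts as concrete if it has a digit (covers numbers and dates),
# a backticked identifier, or two adjacent capitalized words.
CONCRETE = re.compile(r"\d|`[^`]+`|\b[A-Z][a-z]+ [A-Z][a-z]+\b")


def voice_score(text, surface="default"):
    """Return (score, deductions). The weights are placeholders, not the real rubric."""
    deductions = []
    lowered = text.lower()
    for phrase in BANNED_PHRASES:
        if phrase in lowered:
            deductions.append((10, f"banned phrase: {phrase}"))

    # Naive sentence split; abbreviations and decimals will confuse it.
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    if sentences:
        average = sum(len(s.split()) for s in sentences) / len(sentences)
        target = TARGET_WORDS[surface]
        if average > target:
            points = min(20, 2 * math.ceil(average - target))
            deductions.append((points, f"average sentence {average:.1f} words, target {target}"))

    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    for number, paragraph in enumerate(paragraphs, start=1):
        if not CONCRETE.search(paragraph):
            deductions.append((5, f"paragraph {number} has no concrete detail"))

    score = max(0, 100 - sum(points for points, _ in deductions))
    return score, deductions
```

Returning the deductions alongside the score is what makes the gate useful as an advisory tool: the reviser sees which rule cost points, not just the total.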

Voice-gate is advisory. It surfaces a score and the deductions; the human or agent decides whether to ship. In practice the agent revises until the score clears 70. The voice specification is a single Markdown file; updating the gate is a 10-line edit. The gate is a 200-line Python script. Not magical.

Public-content scrubber. A hard gate against financial-data leakage in public output. The Daily AI Agents scrubber walks the text for dollar amounts, 10 financial-term keywords (the most common ones), customer counts (the "N paying users" pattern), and platform-specific URLs that imply a private track record. Any hit blocks the publish.
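The detection pass described above can be sketched with a few regular expressions. This is an assumption-laden illustration: the keyword list is a hypothetical stand-in for the 10 real keywords, the platform-URL check is omitted because the source does not name the platforms, and `find_hits` is an invented name.

```python
import re

# Dollar amounts such as $47, $4,200 or $12.50, with an optional k/m suffix.
DOLLAR = re.compile(r"\$\d[\d,]*(?:\.\d+)?[kKmM]?")

# Hypothetical keyword list; the source only says there are 10 of them.
FINANCIAL_TERMS = re.compile(r"\b(?:revenue|mrr|arr|profit|margin)\b", re.IGNORECASE)

# The "N paying users" customer-count pattern.
CUSTOMER_COUNT = re.compile(r"\b\d[\d,]*\s+paying\s+users\b", re.IGNORECASE)

DETECTORS = (("dollar", DOLLAR), ("term", FINANCIAL_TERMS), ("customers", CUSTOMER_COUNT))


def find_hits(text):
    """Return every raw hit as (kind, matched_text, offset), ordered by position."""
    hits = []
    for kind, pattern in DETECTORS:
        for match in pattern.finditer(text):
            hits.append((kind, match.group(0), match.start()))
    return sorted(hits, key=lambda hit: hit[2])
```

For example, `find_hits("We hit $4,200 MRR with 37 paying users.")` reports three hits: the dollar amount, the keyword, and the customer count.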

Sticker prices are allowed via 2 allowlist patterns. The PRICING_CONTEXT_AFTER pattern matches a dollar amount immediately followed by a billing unit or a recognized sticker-price keyword: $99/mo, $47 ebook, or an amount followed by max position, hard floor, starter, team, bundle, or course. The PRICING_CONTEXT_BEFORE pattern matches keyword-then-amount phrasings (drops to $X, position size at $X, caps at $X).
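A minimal sketch of how the two allowlist patterns might look as regular expressions follows. The pattern names and keywords come from the paragraph above; the exact regex bodies and the helper names `is_sticker_price` and `unallowed_amounts` are assumptions for illustration only.

```python
import re

DOLLAR = r"\$\d[\d,]*(?:\.\d+)?"

# Amount followed by a billing unit or sticker-price keyword: "$99/mo", "$47 ebook".
PRICING_CONTEXT_AFTER = re.compile(
    DOLLAR + r"(?:/mo\b|\s+(?:ebook|max position|hard floor|starter|team|bundle|course)\b)",
    re.IGNORECASE,
)

# Keyword phrase followed by an amount: "drops to $X", "caps at $X".
PRICING_CONTEXT_BEFORE = re.compile(
    r"\b(?:drops to|position size at|caps at)\s+" + DOLLAR,
    re.IGNORECASE,
)


def is_sticker_price(text, start, end):
    """True if the amount at text[start:end] lies inside an allowlisted context."""
    for pattern in (PRICING_CONTEXT_AFTER, PRICING_CONTEXT_BEFORE):
        for match in pattern.finditer(text):
            if match.start() <= start and end <= match.end():
                return True
    return False


def unallowed_amounts(text):
    """Return the dollar amounts that no allowlist pattern covers."""
    return [
        match.group(0)
        for match in re.finditer(DOLLAR, text)
        if not is_sticker_price(text, match.start(), match.end())
    ]
```

The allowlist is checked per amount, not per document, so one sticker price in a paragraph does not excuse a realized figure sitting next to it.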

The Daily AI Agents scrubber is fail-closed. If the regex finds a hit and no allowlist context surrounds it, the publish blocks. False positives are acceptable in the 1-2% range; false negatives leak internal financial figures into public output, which is unrecoverable.
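The fail-closed behavior can be sketched as a small wrapper. The function names and the stand-in scanner are hypothetical. Treating a scanner crash as a block is an assumption that goes one step beyond the text, on the reasoning that a broken scrubber should not wave content through.

```python
import re


def naive_scan(text):
    """Hypothetical stand-in scanner: every dollar amount counts as a hit."""
    return re.findall(r"\$\d[\d,]*(?:\.\d+)?", text)


def scrub_or_block(text, scan=naive_scan):
    """Return (allowed, reasons). Anything other than a clean scan blocks the publish."""
    try:
        hits = scan(text)
    except Exception as exc:
        # Assumption: a scanner error blocks too, since a leak cannot be recalled.
        return False, [f"scrubber error: {exc!r}"]
    if hits:
        return False, [f"unallowlisted hit: {hit}" for hit in hits]
    return True, []
```

The only path that returns `True` is the one where the scan completed and found nothing, which is the property that makes the gate fail-closed.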

Both gates run on every public surface: the 6 website pages, the public skill bundle README, the Friday ship log, the founder letters, and the email bodies of customer-facing notifications. Anything that touches a customer goes through both.

The gates compose. An agent draft might score 62 on voice (below threshold) but pass the scrubber clean. The agent revises the prose, rescores, and ships only when both gates pass.
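One way to picture the composition is a small report object that holds both results. This is a sketch only: the class and field names are invented for illustration, and treating a score of exactly 70 as passing is an assumption about what "clears 70" means.

```python
from dataclasses import dataclass, field

VOICE_THRESHOLD = 70  # default threshold from the voice-gate section


@dataclass
class GateReport:
    voice_score: int
    scrubber_hits: list = field(default_factory=list)
    threshold: int = VOICE_THRESHOLD

    @property
    def voice_ok(self):
        # Assumption: a score equal to the threshold passes.
        return self.voice_score >= self.threshold

    @property
    def scrubber_ok(self):
        return not self.scrubber_hits

    @property
    def ship(self):
        """Ship only when both gates pass."""
        return self.voice_ok and self.scrubber_ok

    def reasons(self):
        """Explain every failing gate so the reviser knows what to fix."""
        found = []
        if not self.voice_ok:
            found.append(f"voice {self.voice_score} below threshold {self.threshold}")
        found.extend(f"scrubber hit: {hit}" for hit in self.scrubber_hits)
        return found
```

The draft from the example above would be `GateReport(62)`: the scrubber is clean, `ship` is `False`, and `reasons()` names the voice score as the only thing left to fix.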

The discipline matters because it's structural. Proofreading depends on the founder remembering to do it; a pre-publish step in the pipeline runs on every draft. That is the gap between 30% off-voice output by default and 100% on-voice output.

**Chapter 4 summary:** Two gates on every public output. Voice-gate scores against a documented spec; threshold 70. Scrubber hard-blocks financial leaks; allowlist permits sticker prices. Both are simple Python scripts; the discipline is structural, not heroic.
