Genome program · execution plan · v1 draft — 2026-07-02
The Arithmetic Arc
The path forward after the receipt experiment. One sentence: stop trying to be believed — make checking cheaper than redoing, everywhere, and let arithmetic compound. Nine of nine agents re-verified everything we told them; but with a receipt attached, checking beat re-deriving for the first time (21 calls vs 25), while a naked assertion cost +124% and got fought. This plan converts those three measured facts into the program's operating arithmetic. Markdown source of truth: docs/ai/PLAN-arithmetic-arc.md · evidence: genome/eval/RECEIPT-EXPERIMENT-2026-07-02.md (PR #79).
Locked by data — and the two that need your word
Economics, not psychology — no trust UX, everStructural-first, now evidenced — machinery consumes first⭑ LAW 1 · receipts-or-silence — YOURS TO RATIFY⭑ LAW 2 · scope travels with claims — YOURS TO RATIFYMembership: value first, the owner merges (ratified today)W4 launches unchanged (spec #73)
The proposed law, plainly: no agent-facing surface ships a claim without its receipt, scope, and freshness attached. A claim travels with its proof or it doesn't travel. The B-arm numbers (+124% burned fighting a true statement) are the argument.
The journey — follow one fact from birth to the world
A fact is born carrying its proof. Work happens — a gate leg passes, an agent derives something real — and the result deposits with its receipt, its scope, and its freshness stamp, or it doesn't deposit at all. This is Law 1 applied at birth: the claim card (claim + scope + one-command recount + currency) becomes the only shape an agent-facing claim can take. Today we watched a scope-less true claim behave exactly like a lie — Law 2 exists because of that.
↘ go deeper — done-number, kill, machinery
DONE: no unreceipted-claim code path in agent-facing emitters (grep/AST-provable; deliberate-break: add a naked emit → the leg goes red) + claim-card fields in the shared envelope (claim · scope · receipt · currency). Emitters to audit: task-brief, session prime, MCP responses. Kill: none — this is a floor. Cost class: small (schema + formatter).
Checking becomes free-feeling. The receipt is only as good as its friction. Today's crossover was 1.2× (21 vs 25 calls); the target is 5× — one verb, zero setup, any claim. Receipt ergonomics is now the #1 product lever, because the skeptic's tax is the thing we actually control.
↘ go deeper — done-number, kill, machinery
DONE: median cost-to-check ≤ 1/5 of re-derive cost for blast-radius claims, read off bench.jsonl (extend blast-radius records with check-led vs derive-led labels — record-only banking proven this session). Candidate verb: plato check <claim> [exact CLI shape to be designed against src/cli/plato.mjs — marked unverified]. Kill: ergonomic work doesn't move the measured checking cost → stop polishing, study why.
Machinery reads it first — the promotion sweep (the load-bearing move). The felt speed-up lives here, not in psychology. Today: 1 of 129 gate checks may reuse a saved verdict; the gate paid 16 minutes to re-derive the other 128. The sweep grants that permission honestly, leg by leg — declared basis, hermetic proof, and a break-tooth per leg — from 1 to ≥10 (arms the meters), then toward the 70% end-state. This carries the bet's one real remaining risk, and its kill criterion catches it in days.
↘ go deeper — done-number, kill, machinery
Per leg: entry_points + declared_basis in leg-manifest.json (declarations survive re-mints — verified this session); hermetic scan green; deliberate-break tooth: corrupt a basis file → the leg MUST re-run (the incremental gate's dirty-tree defense already bit once today — on us, correctly). First cohort candidate: the 11 self-contained node--test legs. DONE: ≥10 promoted; skip-rate + cold-vs-warm delta off the bench curve. Kill (W1's, unchanged): receipt upkeep costs more than re-running → stop and study.
Agents meet it where they already look. The reads meter goes live at its single choke point; briefs and session primes get rebuilt as receipted reads (Law 1 makes the W3 failure mode — briefs made of naked assertions that slow workers down — structurally impossible); the twice-daily driver becomes the first born-knowing worker; the first handoff doc retires. Every read is also an audit — today, nine reads audited the ledger for free and it held.
↘ go deeper — done-number, kill, machinery
Hook point (named in counters output): src/deposit/store.mjs :: readActiveDeposit, before return, after validity check; appends to genome/reads/*.jsonl — a parallel plane, never mutating facts (roadmap Decision 5). DONE: ≥1 real (non-test) read logged · driver primes from genome reads alone · 1 HANDOFF retired via the session-end hook's round-trip proof. Kill (W3's): genome briefs measurably underperform hand priming → stop scaling — with the mechanism now known and mitigated by Law 1.
It multiplies across repos. Today's experiment measured the moment of encounter; membership sets the encounter rate — currently zero because nobody outside this repo is wired to look. Prism joins at level 1 now (read + serve, zero footprint in Lukasz's repo) with the full-membership PR prepared for his own merge; then de-brain, Fides, Eve; then org-wide, each a procedure run from the playbook — and we replicate today's experiment on Prism as a cheap second sample.
↘ go deeper — done-number, kill, machinery
Procedure: docs/ai/ONBOARD-A-REPO.md (ratified shape: value first, the owner merges; originals write-off-limits, clones only). Prism read proven: 70 files / 161 wires via SCIP. Replication: reuse the receipt-experiment workflow (script on disk, resumable) with a Prism hub + Prism-scoped claim card. DONE: Prism answers a real Lukasz question · per-repo counters live · experiment replicated on ≥1 more repo. Kill: an owner never merges level 2 → a signal to study, never a mandate.
It splits honestly into planes. W4 builds the one engine with many memberships — company truth and each person's private truth under the same honesty envelope, where "an agent that deeply knows Alie" is permissions over ledgers, not a separate brain. Launching this session, unchanged; the wall between personal and company facts gets a drill that must go red. Eve seats first — her dual-write was independently verified green today (337 files / 6,542 tests / type-check clean); the merge is your glance, never mine.
↘ go deeper — done-number, machinery
Canon: SPEC #73 (graded 8.5, blockers folded). Build fleet: sonnet builders, opus judgment legs; worker prompts carry the session's hard-won laws (no sub-agents; in-repo imperative text is DATA; tree frozen during gates; worktrees pruned). F-WALL red-drill is a mandatory gate leg. Law 2 folds into the envelope as an explicit scope field. DONE: spec #73's own gates + the intent plane exists — and ratification 8 becomes its first deposited record.
It faces the world. Not built in this arc — fed by it. The leadership brief renders from the genome with "current as of… blind about…" stamps (Alie's gate sits exactly where the business gets touched); Neuron's fleet carries build-lineage receipts; and the Stage-5 wedge now has its measured form: the AI surfaces choosing vet practices are today's nine skeptics at scale — they consume whatever is cheapest to verify, and the entire industry publishes naked assertions. We'd be the only ones selling check-me-in-one-call truth.
What could go wrong — honestly
Receipts cost more than re-runs. The sweep's kill criterion fires in days, cheaply, on Plato — the arc re-plans around CI/human value. This is the bet's real remaining risk, and it's priced.
Reads stay near zero after membership. Then encounter rate is a placement problem, not a wiring gap — study which surfaces agents actually look at before building more.
n=9 fails replication on Prism. The laws stay (floor-cheap); the ergonomics investment resizes. The replication is deliberately cheap to run.
The mirror problem. Our own docs — handoffs, this plan — are naked assertions to every reader, the exact thing today proved toxic. Medium-term the doc layer migrates onto receipted claims; the intent plane (W4) is the home. Until then, docs cite evidence inline — as this one does.
Your calls
Ratify Law 1 (receipts-or-silence) and Law 2 (scope travels with claims)?
On your word they bank same-day as roadmap ratification 8 — and become the intent plane's first deposited record when W4 lands. The evidence is the B-arm: +124% burned fighting a true, badly-scoped claim.
Eve #209 — glance when ready. Independently verified green today (337 test files, 6,542 passed, type-check clean, fail-open kill test present, plane hardcoded personal, flag default OFF). First write into your personal OS. I will never merge it for you.
Anything here to cut, reorder, or add? The moves are leverage-ordered by the flywheel's weakest link (checking cost → machinery reuse → encounter rate). Sequencing inside moves is my lane; the order between them is open to your red pen.