PassLane “Coach” — Master Plan

AI learning companion · plan only, no code · generated 2026-06-18

PassLane AI Companion — The Master Plan

Codename: Coach · Principal / Head of Product + Engineering · 2026-06-18

Status: definitive. Supersedes all prior drafts.

Provenance of claims (read this first). Two classes of fact appear below, tagged where it matters. [repo-verified] = checked against /Users/arizona/CLAUDE CODE/passlane on 2026-06-18 (file/line/count confirmed). [API-assumption] = Anthropic API behavior or pricing as of the 2026-01 knowledge cutoff, to be re-confirmed against current docs before the Phase-1 build. The plan is engineered so that no [API-assumption] flipping silently breaks a load-bearing section — each such dependency carries an explicit fallback. We do not claim blanket "everything is verified"; a plan whose moat is honesty cannot afford a single falsifiable boast.

1. Executive Summary

PassLane already turns dead commute time into mastery for a brutal exam — roughly half of insurance-license candidates fail, almost always from under-preparation and skipped state law. Coach is the teacher who rides along: a calm, candid exam instructor who lives inside the existing one-file app, speaks in the same af_bella voice that already reads questions aloud, is silent until summoned, and never says a word it can't trace to a vetted explanation — when the bank doesn't cover something, Coach says so instead of inventing the law. It earns its keep in three postures the learner pulls, never the app pushes: Train (teach a concept), Test (drill weak areas and coach mock exams), Talk (think a question through). The economics that killed Quizlet's Q-Chat are designed out from day one: the entire teaching corpus is pre-generated offline against the fixed bank, human-reviewed, content-hashed, and shipped into local app data — so the high-value path runs fully offline at zero runtime cost and ships free to every learner, with only live open-ended chat metered behind a key-holding edge proxy as the Pro headline.

What we are building, in one sentence: A grounded, voice-first study companion that lives inside PassLane, speaks in its existing voice, and can be summoned to Train, Test, or Talk you toward your license — provably never inventing the law, never leaking an answer during an exam, and never running up an unmetered bill.

Two honest constraints stated up front, because the plan is built around them:

Spoken Coach has a real audio gap to close. [repo-verified] Only 215 of 323 AZ questions have read-aloud clips today; 108 (33%) have none, and no Coach copy is voiced at all. The af_bella generation pipeline is absent from the repo. So the spoken commute is a built deliverable with a named work-stream and cost line (§4.7, §6.3), not an inherited freebie. Text Coach ships first and is fully functional for 100% of the bank.
Offline audio fails for a specific, now-correctly-diagnosed reason (cross-origin service-worker bypass, not a cache refusal), which changes the fix (§4.7).

2. Design Principles (the non-negotiables)

Each is enforced in code or a CI gate, and each killed a tempting alternative.

Grounding buys correctness; model tier buys polish. Every factual claim is tied to a vetted bank explanation via the Citations API. The cheap model (Haiku 4.5) is the default because correctness comes from the retrieved source, not the parameter count. We pay for Sonnet/Opus only where warmth and judgment — not facts — are the value.

Local-first is a hard constraint, enforced in code — not a slogan. Core study sends nothing and needs no cloud. The companion is purely additive: if Coach is unavailable, the existing text+tap study path is byte-for-byte unchanged. The heavy teaching path is pre-generated and ships in the app like the question banks already do (scripts/export-pack.mjs rebuilds the same questions*.json filenames with no index.html change).

Honesty is the moat — made mechanical, not promised. Coach shows its source on every claim; when grounding is thin it refuses and says so. This is the literal UX of "no hallucinations," and it mirrors the shipped voice-out rule [repo-verified, speak(), index.html:2790–2793: "Recordings are the ONLY voice. Never fall back to robotic system TTS"].

Quiet by default. Coach speaks only at four earned moments (you ask, a feedback reveal, you ask to be drilled, a rare threshold-warmth). Per-answer chatter is forbidden by construction — it would regress the codebase's deliberate restraint (warmthTail fires once at the 3rd-miss or 8-streak; the Coach Reveal is neutral, no buzzer). "The AI talks too much" must be impossible, not merely tuned away.

Exam integrity is a wall, inherited from existing gates. Coach is hard-disabled whenever isExam is true — the mic is already hidden [repo-verified, index.html:2737], read-aloud already gated [2809/2839]. "Build mastery, never enable cheating" is enforceable at gates already in the repo, on two surfaces (in-app and the public answer-audio CDN — see §5.4).

The voice contract is frozen behavior. All mic access routes through the single startListening/stopListening chokepoint (ISOLATION RULE #3). The partialResults-resolves-empty quirk is load-bearing. node voice-sandbox/harness.js must exit 0 before and after any change near the listen window.

Pre-generate once, serve forever. The fixed bank means the entire companion corpus is computed offline (Batch API), reviewed, and cached. Runtime LLM cost is effectively $0; live calls exist only for what genuinely cannot be precomputed — a learner's own words.

Build inside the constitution. Own a cx- CSS prefix, one render-region writer, plain JS / no build step (introducing TS or a bundler here crosses the simplicity line and is not warranted). Anchor every edit by symbol, not raw line number — the 287KB single file shifts.

Engine-aware by construction, PassLane first. Coach reads grounding from the same per-exam content packs the STATE_FILE map and D1 verticals→exams→categories→questions schema already model. Prompts, refusal copy, voice-id, and the confusable-map live as per-vertical config. Build and tune for Arizona/insurance first; CDL/NCLEX/real-estate inherit the companion with no code fork.

3. The Companion Experience — Train / Test / Talk

3.1 Persona

One presence, unnamed-feeling (the UI says "Coach," never a mascot, no avatar — brand law is warm, calm, teacher-first, de-cheesed). It is the same af_bella voice as read-aloud, so Coach is the teacher who's been reading you the questions, now leaning in — the seamlessness Speak's users praise and Duolingo Max's "scripted, like free AI" lacks. Diction: short sentences, plain English, names the exact concept and the exact misconception, no "Great job!" filler. Candor is the character — on thin grounding: "The bank doesn't cover that one head-on — here's the closest principle it does teach." A tutor that bluffs on a licensing exam is a liability.

Note on the voice contract: "af_bella" is not a code-enforced constant — [repo-verified] it appears zero times in index.html and exists only as a single top-level "voice":"af_bella" field in app/audio/states-manifest.json (manifest convention, not frozen API). We make it a real contract: the export/voice pipeline stamps and asserts voice === 'af_bella' on every new Coach clip, the way harness.js makes the voice contract real — so nothing can silently ship a clip in a different voice.
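The stamp-and-assert step could look roughly like the sketch below. It assumes a hypothetical per-clip record shape ({ id, voice }); today's states-manifest.json carries only one top-level voice field, so the per-clip stamp and the function name are illustrative, not existing pipeline code.

```javascript
// Hypothetical export-time check: every new Coach clip must be stamped af_bella.
const REQUIRED_VOICE = 'af_bella';

// `clips` is an assumed array of { id, voice } records emitted by the voice pipeline.
function assertVoiceContract(clips) {
  const offenders = clips.filter((clip) => clip.voice !== REQUIRED_VOICE);
  if (offenders.length > 0) {
    const ids = offenders.map((clip) => clip.id).join(', ');
    throw new Error(`voice contract violated by: ${ids}`);
  }
  return clips.length; // number of clips verified
}
```

Throwing, rather than warning, is what would let an export script fail the build the same way the voice harness does.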

Mandatory AI + scope disclosure (an Anthropic AUP contract term for a high-risk vertical, not a flourish) opens Coach's first turn of a session, once, in PassLane's voice:

"Quick note — I'm an AI study coach. I help you learn the exam's answers. I'm not a licensed agent, and this isn't insurance advice. Okay, let's get you ready."

This single line satisfies the AUP disclosure requirement, draws the exam-prep-vs-advice legal line, and meets the FTC honest-AI bar at once.

3.2 Default posture — "Quiet Companion"

The shipped study loop (mode_select → reading → listening → feedback → advancing) is untouched. Coach earns the right to speak in exactly four moments, then returns to silence:

Moment: You ASK
Trigger: explicit Ask gesture + an off-script question
Coach does: Train/Talk, grounded (clip if pre-gen'd & voiced, else text + earcon)

Moment: You answered / didn't
Trigger: normal feedback reveal
Coach does: the existing speakFeedback/revealUnanswered, optionally enriched by pre-gen elaboration

Moment: You ask to be QUIZZED
Trigger: "drill my weak spots"
Coach does: builds a queue via existing Leitner/weakCategories, runs the normal answer flow

Moment: A real threshold fires
Trigger: existing warmthTail points (3rd-miss, 8-streak, return-after-gap)
Coach does: a rare, pre-recorded encourage line, never per-answer

3.3 TRAIN — "teach me this" (mostly offline, FREE)

The highest-ROI, lowest-risk surface, and it ships free because it's local data.

Elaborated feedback: a richer "why the key is right and why each tempting wrong answer is wrong" than the terse bank string. Pre-generated, grounded strictly in the question's explanation+choices+correct. Renders as text instantly offline; spoken when a clip exists (or offline once the clip-cache warms, §4.7). Evidence: elaborated feedback shows an effect size of d ≈ 0.49, roughly 10× bare corrective feedback.
Misconception repair via distractor-as-diagnosis — Coach knows the wrong letter you picked (submitAnswer(letter)) and surfaces that distractor's specific confusion ("you mixed up insurable interest at inception (life) with at loss (property)"), then optionally queues a near-transfer twin from the same category to confirm the repair stuck.
Worked-example fade, hard minority only [repo-verified: difficulty ≥ 4 = exactly 55 of 323]: a deterministic rule on existing fields — box 1–2 = full steps, box 3 = completion problem (Coach sets up, you finish aloud), box 4–5 = no scaffold (respects the expertise-reversal effect). Only the step text is pre-generated; the fade decision needs no model call.
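The fade rule is simple enough to state as a pure function. The sketch below assumes difficulty and box are plain integers read from the existing question and Leitner fields; the function name and the returned labels are illustrative.

```javascript
// Deterministic worked-example fade; no model call is involved.
// `difficulty` and `box` are assumed to mirror the existing question/Leitner fields.
function scaffoldLevel(difficulty, box) {
  if (difficulty < 4) return 'none';   // the fade applies to the hard minority only
  if (box <= 2) return 'full-steps';   // box 1-2: show every step
  if (box === 3) return 'completion';  // box 3: Coach sets up, the learner finishes aloud
  return 'none';                       // box 4-5: no scaffold (expertise reversal)
}
```

For example, scaffoldLevel(4, 3) selects the completion-problem form, while scaffoldLevel(5, 4) shows no scaffold at all.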

Card [repo-verified, questions.json[0], id pc001, correct D]: For a property insurance policy, insurable interest must exist at what point in time? (A) any time (B) when applied for (C) inception and loss (D) at the time of loss. Learner picks C.

Coach (cites pc001): "Close — C is the classic trap. For property, you only need insurable interest at the time of the loss, so it's D. You're borrowing the life rule, where the interest only has to exist at inception. That inception-vs-loss split is the whole point of this one."

3.4 TEST — "drill me / coach my mock exam" (reuses existing machinery)

Drills run through existing buildQueue + weakCategories + Leitner: Coach is the orchestrator, the shipped quiz loop is the engine. The testing effect (g ≈ 0.5–0.6) comes free from the answer-before-reveal flow.
Discrimination drills (block-then-interleave): keep blocked practice while a category is brand-new, then juxtapose 2–3 specifically confusable categories (admitted↔non-admitted, HMO/PPO/POS, whole/term/universal — from a content-pack confusable-with map) and ask the learner to state the distinguishing rule. Interleaving's biggest documented win is exactly this similar-between/dissimilar-within material.
Mock-exam coaching is BEFORE/AFTER only — mute DURING. Warm-up before; after scoring, calibration confrontation: predicted-vs-actual readiness, the 2–3 categories the gap lives in, and a refusal to say "ready" until calibrated mastery clears READINESS_THRESHOLD. Lowest performers are the most overconfident and quit early — this raises scores and is maximally on-brand. Enforced at the verified isExam gates.
Explain-back — Coach's signature retrieval loop — ships UNGRADED by default. On ~1-in-5 correct answers Coach asks "in one sentence, why is C right?"; the learner speaks it. The ungraded self-check delivers the full free-recall benefit with zero false-negative risk (a false "you're wrong" on a correct paraphrase is trust-killing for an anxious adult, and Ask-mode drops the A–D speech biasing so mis-transcription compounds the risk). The graded variant — which can hold a card in a lower Leitner box to catch "right for the wrong reason" — unlocks only after a measured precision bar (§5.5).

3.5 TALK — "let's talk it through" (Pro, online, the small metered tail)

Open-ended grounded Q&A, Sonnet 4.6, warm, hard-capped ~2–3 turns, tethered to the question's explanation, text-reply on screen at launch (no system TTS — §9). Socratic is a ≤1-beat scalpel, then a clear answer (RCT evidence: Socratic-heavy tutoring shows no outcome gain and feels withholding to time-pressured adults). Opus 4.8 is reserved for a single premium end-of-session mock-exam diagnostic that reasons across the whole miss-pattern. Never when isExam. Latency budget and failure behavior are specified in §3.9.

3.6 The voice-first loop — two postures over one mic lifecycle

The shipped listen window is frozen behavior tuned for SHORT utterances [repo-verified: taskHint=.confirmation, contextualStrings=['A'..'D'], VOICE_SILENCE_BUDGET_MS=5000, VOICE_HARD_CAP_MS=12000 an absolute per-question ceiling, NATIVE_RESTART_COOLDOWN_MS=400 guarding the teardown race]. A conversation needs the opposite. So:

ANSWER mode = the existing window, untouched. Harness stays green.
ASK mode = a separate, explicitly entered posture (its own generation counter, extended silence budget, no .confirmation hint, no letter-priming, explicit end-of-turn), signaled by a distinct earcon and mic color so the user always knows which mode they're in.
The Ask gesture (a small cx-ask button, or long-press the mic) calls stopListening() to cleanly endpoint, then enters ASK mode via the same chokepoint. It reads the same partialResults channel — start() stays fire-and-forget; the companion never touches the silence watcher or the answer-path counters.
Routing is surgical: the "talk to it" branch plugs into the one place every utterance is classified — [repo-verified] the ABSTAINED fall-through of processVoiceMatches (index.html:3784) — but only when askMode === true. In pure answer mode, a fell-through utterance stays the shipped "Didn't catch that" copy. We never auto-route a mis-heard answer into a chatbot.
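A minimal sketch of that routing decision, assuming a hypothetical state object with isExam and askMode flags. It is not the body of processVoiceMatches, only the branch the plan proposes to add at its fall-through.

```javascript
// Hypothetical router for an utterance that matched no answer letter.
// `state` is an assumed shape: { isExam, askMode }.
function routeAbstained(utterance, state) {
  if (state.isExam) {
    return { action: 'ignore' };                            // exam wall: Coach stays inert
  }
  if (state.askMode === true) {
    return { action: 'ask-coach', text: utterance };        // explicit ASK posture only
  }
  return { action: 'reprompt', copy: "Didn't catch that" }; // shipped answer-mode behavior
}
```

The strict askMode === true comparison keeps a missing or undefined flag on the answer-mode path, so a mis-heard answer is never routed to the chatbot by default.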

Decisive scope call — spoken multi-turn Ask is spike-gated; TEXT-ASK is the Phase-3 default and ships regardless. A long ASK window that reopens the mic mid-question is exactly the reopen-cycling the 5s budget was introduced to kill (the budget replaced a 14s one for this reason), and it collides with the absolute 12s per-question cap and the teardown race. Multi-turn dictation on the local speech fork is untested. So text-ASK (an "Ask Coach…" field, same grounding, same citations) is the conversational tier on day one of Phase 3, with zero long-mic risk. Spoken ASK ships only after an on-device spike proves, on real iOS and Android, that (a) the engine doesn't bail mid-sentence, (b) reopening doesn't trip the 2-sessions race, and (c) ASK runs on a non-question generation id with its own ceiling decoupled from the 12s cap. The spike is scheduled against docs/IOS-VOICE-TEST-PLAN.md [repo-verified to exist].

3.7 Memory — read, don't store

All cross-session continuity is a READ over state that already persists locally (az_ keys: Leitner box, per-category accuracy, miss/right streaks, sessionMissed). Coach already knows what you keep missing — it just isn't speaking from it yet. The return-opener is specific and grounded — "Last time, claims-made vs occurrence tripped you twice — want to start there?" — sourced entirely from local data. Persist at most a tiny local cx_memory object (last topic, last weak categories, last-seen date) in localStorage. No transcripts, no PII, no third party. This is the highest-trust, lowest-risk feature and it ships first — and it doubles as the struggle-signal source that lets us honor the no-tracking promise without adding analytics.
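As an illustration of how small cx_memory can stay, the sketch below persists three fields and builds the return-opener from them. The helper names and the exact opener wording are assumptions; storage stands in for any object with getItem/setItem, such as window.localStorage.

```javascript
// Hypothetical cx_memory helpers. `storage` is any getItem/setItem object.
const MEMORY_KEY = 'cx_memory';

function saveMemory(storage, { lastTopic, weakCategories, lastSeen }) {
  // Only three small fields are persisted: no transcripts, no PII.
  storage.setItem(MEMORY_KEY, JSON.stringify({ lastTopic, weakCategories, lastSeen }));
}

function returnOpener(storage) {
  const raw = storage.getItem(MEMORY_KEY);
  if (!raw) return null; // first session: nothing to recall, Coach stays quiet
  const memory = JSON.parse(raw);
  if (!memory.lastTopic) return null;
  return `Last time, ${memory.lastTopic} tripped you up. Want to start there?`;
}
```

Because the opener is derived only from this object and the existing local state, nothing here needs a network call.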

3.8 Graceful text fallback (an equal citizen, not a downgrade) — including the unvoiced third

Because the entire pre-generated corpus renders into the existing feedback-expl DOM region with no model call and no network, text is first-class: silent study, sound-off, no-mic/permission-denied, the web build, and offline all degrade to a clean text experience. Every Coach claim — spoken or text — carries an inline "source: Q pc001" chip; when Coach has no vetted answer it shows "I don't have a vetted answer for that yet" rather than inventing.

The 33%-unvoiced case is a day-one UX state, not an edge case [repo-verified: 108 AZ questions have no clip]. Defined behavior when Coach would speak but the question has no base clip and no Coach clip yet:

Coach renders its full text immediately (identical teaching content), and
the read-aloud/Coach speaker control shows a quiet "text only for this one" affordance (no error, no dead button, no "voice coming soon" vaporware) — consistent with the existing graceful-degradation idiom.
The question id is added to the voice-coverage backlog the af_bella pipeline drains (§4.7/§6.3), so coverage rises release over release.

This makes the experience whole for 100% of the bank from the first ship, with audio as progressive enhancement.

3.9 Live-path latency & failure budget (Plane B felt-responsiveness)

Cost caps (§4.6) govern spend; this governs feel for an anxious commuter:

Time-to-first-token target: stream the reply; first token visible ≤1.5s p50 / ≤3s p95 on the Sonnet path (prompt-cache warm). The UI shows an immediate "Coach is thinking…" state with the source chip already pinned, so the grounding is visible before the prose.
Streaming, not blocking: tokens render as they arrive; no spinner-then-dump.
Worker timeout / cold-API behavior: if first token doesn't arrive within 8s, or the Worker/Claude call errors, fail closed to Plane A — show the local grounded explanation ("Coach is offline right now — here's the vetted explanation") instead of an error. A failed live turn never costs the user a turn against their budget.
Hard turn cap UX: at the ~2–3 turn ceiling, Coach closes warmly and offers a drill, never abruptly cuts.
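The fail-closed rule above can be sketched as a race between the live call and a deadline. Here startLiveTurn is a hypothetical function that returns a Promise resolving with the first streamed token; the result shape and names are illustrative.

```javascript
// Race a live turn against a first-token deadline; fail closed to local text.
// `startLiveTurn` is a hypothetical function returning a Promise of the first token.
function firstTokenOrFallback(startLiveTurn, localExplanation, timeoutMs = 8000) {
  const fallback = { source: 'plane-a', text: localExplanation, chargeTurn: false };
  let timer;
  const deadline = new Promise((resolve) => {
    timer = setTimeout(() => resolve(fallback), timeoutMs);
  });
  const live = startLiveTurn()
    .then((token) => ({ source: 'plane-b', text: token, chargeTurn: true }))
    // A Worker or API error also falls back, and never charges the user's budget.
    .catch(() => fallback);
  return Promise.race([live, deadline]).finally(() => clearTimeout(timer));
}
```

Whichever promise settles first wins, so a slow or failed call surfaces the vetted local explanation instead of an error state.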

4. The Intelligence Architecture

4.1 Two planes

Plane A — Coach (offline, every learner, ships first): per question, an elaborated rationale + per-distractor misconception/repair + a short Socratic follow-up, grounded ONLY in that question's explanation/choices/correct. Batch-pregenerated, human-reviewed, content-hashed, shipped into app data, and (where the pipeline has voiced it) pre-rendered to af_bella audio. Runtime cost $0, network none.
Plane B — Tutor (live, Pro, post-launch): open-ended Talk + explain-back grading + mock-exam debrief, via a key-holding sibling Worker, gated behind server-verified entitlement, a spend budget, default-OFF consent, and an updated privacy posture.

4.2 Retrieval — ranked selection, NOT a vector DB

The relevant unit is a single known question — the one on screen, or the top keyword/category hit. Retrieve that question + its same-category siblings, never a whole bank. A vector DB at launch would violate zero-cloud-to-launch, add latency, and add infra to operate. Written upgrade trigger: add embeddings only when (a) a future vertical's sibling set overflows a sane prompt budget, or (b) the eval harness's recall@k drops below bar (§5.2). The eval gate forces the upgrade; we don't pre-build it.
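A sketch of ranked selection without embeddings, assuming the bank is an in-memory array of records with id and category fields. The maxSiblings cap is an illustrative parameter, not a number from the plan.

```javascript
// Ranked selection without a vector DB: the known question first,
// then same-category siblings, capped to keep the prompt small.
// `bank` is an assumed array of { id, category, explanation } records.
function retrieveGrounding(bank, questionId, maxSiblings = 5) {
  const target = bank.find((q) => q.id === questionId);
  if (!target) return []; // unknown id: nothing to ground on, so Coach should refuse
  const siblings = bank
    .filter((q) => q.category === target.category && q.id !== target.id)
    .slice(0, maxSiblings);
  return [target, ...siblings];
}
```

An empty result is the signal for the refusal path rather than a reason to widen the search.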

4.3 Citation & grounding — the hallucination firewall

Each retrieved explanation is one Citations API custom-content block, so cited text does not count toward output tokens and composes with prompt caching and Batch. Claude returns block-level citations that render as the source chip. A system-prompt refusal contract ("answer ONLY from the provided explanations; if absent, say so and offer to drill a related concept; never invent statutes, numbers, deadlines, or state-specific law") plus a runtime guard (zero citations on a factual claim → suppress the spoken reply, show "no vetted answer yet") make the moat mechanical. [API-assumption: Citations API block-level behavior and ZDR-eligibility per 2026-01 docs; confirm before build.]
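The runtime guard could be as small as the sketch below. It operates on a hypothetical normalized reply shape (sentences carrying a factual flag and a list of cited question ids); that shape is an assumption for illustration, not the API's response format.

```javascript
// Hypothetical runtime guard over a normalized reply:
// { sentences: [{ text, factual, citations: [questionId, ...] }] }.
// This shape is assumed for illustration; it is not the API's response format.
const NO_VETTED_ANSWER = "I don't have a vetted answer for that yet";

function guardReply(reply) {
  const uncited = reply.sentences.some((s) => s.factual && s.citations.length === 0);
  if (uncited) return { speak: false, text: NO_VETTED_ANSWER, sources: [] };
  const sources = [...new Set(reply.sentences.flatMap((s) => s.citations))];
  return { speak: true, text: reply.sentences.map((s) => s.text).join(' '), sources };
}
```

A single uncited factual sentence suppresses the whole reply, which is the strict reading of the rule; dropping only the offending sentence would be a softer variant.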

4.4 Generation is a TWO-PHASE pipeline (with a documented collapse-path)

[API-assumption — load-bearing, confirm first: Citations is incompatible with Structured Outputs (400 error), and toggling citations invalidates the tools cache.] If that holds, a single grounded-and-structured call is impossible, so:

Phase 1 — Generate (grounded prose): Haiku 4.5 + Batch, Citations on, source = the question's explanation + neighbors. Output is plain elaborated prose with citation spans. No structured constraint, no tools.
Phase 2 — Judge (containment check): a separate call (structured output on, citations off) that takes each generated sentence + its cited source and returns {claim_supported, introduces_new_fact}. Any sentence introducing a fact not in the source is dropped or routed to human review. This catches the worst-case licensing failure — confident over-synthesis from correct sources (a fabricated rule a student repeats on the exam) — and it's caught offline, before dissemination, the way voice-sandbox/harness.js proves the voice contract: correctness asserted, not vibed.
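How the judge's verdicts might be applied to the Phase-1 output, as a minimal sketch. It assumes verdicts arrive index-aligned with the generated sentences; the function name and the kept/forReview split are illustrative.

```javascript
// Apply Phase-2 verdicts to Phase-1 sentences.
// Each verdict is { claim_supported, introduces_new_fact }, assumed index-aligned with `sentences`.
function applyJudgeVerdicts(sentences, verdicts) {
  const kept = [];
  const forReview = [];
  sentences.forEach((sentence, i) => {
    const verdict = verdicts[i];
    // A missing verdict is treated as a failure, so nothing unjudged ships.
    const clean = Boolean(verdict) && verdict.claim_supported === true && verdict.introduces_new_fact === false;
    (clean ? kept : forReview).push(sentence);
  });
  return { kept, forReview };
}
```

Routing failures to a review list rather than silently deleting them keeps the human SME in the loop for borderline sentences.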

Fallback if the incompatibility is lifted: if a future API revision lets Citations and Structured Outputs co-exist, the two phases collapse into one grounded-and-structured call — the containment check becomes an output field rather than a second pass, halving pre-gen cost. The architecture is therefore not brittle to this fact flipping; the judge survives as a CI gate regardless (§5.2), since we still want an independent containment assertion even if generation is structured.

4.5 Model routing & caching

Task: All grounded explain/quiz/why + all Plane A pre-gen
Model: Haiku 4.5 [API-assumption: $1/$5 per M, cache-read 0.1×]
Why: Correctness is from grounding, so the cheap model is the default

Task: Live Talk + explain-back grading
Model: Sonnet 4.6 [$3/$15]
Why: Warmth and open-ended judgment

Task: End-of-session deep diagnostic
Model: Opus 4.8 [$5/$25]
Why: Rare premium ceiling, reasons across the whole miss-pattern

Cache at the CATEGORY/whole-bank prefix level, not per-question. [API-assumption: Haiku 4.5 cacheable-prefix minimum = 4,096 tokens.] [repo-verified] AZ explanations average ~37 words (~50 tokens); full per-question context (stem+choices+explanation) averages ~79 words (~105 tokens) — both far below a 4,096-token floor, so a single question's grounding can never clear it. For offline Batch the discount is moot (one pass). For the live path, cache the stable system prompt + the state bank as one shared prefix ([repo-verified] AZ's full bank ≈ 40,993 tokens clears the floor comfortably) so every live call reuses one big cached context — that's where the 0.1× actually pays.
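The arithmetic behind that choice, using only the figures quoted above (the 4,096-token floor itself remains an [API-assumption]):

```javascript
// Figures quoted from the plan; the floor is an API assumption to re-confirm.
const CACHE_FLOOR_TOKENS = 4096;
const PER_QUESTION_TOKENS = 105; // stem + choices + explanation, average
const AZ_BANK_TOKENS = 40993;    // whole AZ bank

const clearsFloor = (tokens) => tokens >= CACHE_FLOOR_TOKENS;

// Smallest number of average-sized questions whose combined context reaches the floor.
const questionsToClearFloor = Math.ceil(CACHE_FLOOR_TOKENS / PER_QUESTION_TOKENS);

console.log(clearsFloor(PER_QUESTION_TOKENS)); // false
console.log(clearsFloor(AZ_BANK_TOKENS));      // true
console.log(questionsToClearFloor);            // 40
```

At roughly 105 tokens each, it takes about 40 questions of context before a prefix becomes cacheable at all, which is why the prefix is the category or the whole bank.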

4.6 Worker proxy, key custody & metering

A sibling of worker/src/index.js, reusing its bearer-auth + KV rate-limit + CORS + fail-closed pattern [repo-verified ~lines 84–110], holding ANTHROPIC_API_KEY as a wrangler secret — never in the 287KB client bundle. Zero Data Retention enabled on the live route [API-assumption: Citations is ZDR-eligible; Batch is not — fine, Batch only ever processes the fixed bank, never user data]. It forwards only question_id + the text turn. Fits the Worker free tier (~100k req/day) because the heavy path is offline.

Metering is a HARD pre-req of Plane B, not a footnote. [repo-verified] Pro is a localStorage boolean (isPro(), index.html:2148), the Worker's only identity is a shared token + a client-asserted device-id — both trivially rotated, and a heavy talker (~40 Sonnet turns/day ≈ ~$9.75/mo at assumed pricing) sinks the $59.99/yr ≈ $5/mo plan. Before Plane B ships: (1) server-verify the store receipt (Google Play / App Store) and mint a per-install signed token — the client flag may gate UI but never gates spend; (2) hard per-user budgets (daily turn cap + monthly token ceiling) that degrade gracefully to Plane A when spent ("You've used today's deep chat — here's the grounded explanation"); (3) a global kill-switch + spend ceiling that fails closed; (4) treat device-id as untrusted and cap globally so mass rotation can't exceed the budget.
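A sketch of the server-side spend gate those four requirements imply. Every field name and limit here is a hypothetical placeholder; the ordering (global caps first, then entitlement, then per-user budgets) is one reasonable reading of the list above.

```javascript
// Hypothetical Worker-side budget check; field names and limits are placeholders.
function authorizeTurn(user, limits, globalState) {
  if (globalState.killSwitch || globalState.spentUsd >= limits.globalCeilingUsd) {
    return { allow: false, reason: 'global-cap' };  // fails closed for everyone
  }
  if (!user.receiptVerified) {
    return { allow: false, reason: 'unverified' };  // the client Pro flag never gates spend
  }
  if (user.turnsToday >= limits.dailyTurns) {
    return { allow: false, reason: 'daily-cap' };
  }
  if (user.tokensThisMonth >= limits.monthlyTokens) {
    return { allow: false, reason: 'monthly-cap' };
  }
  return { allow: true, reason: 'ok' };
}
```

Any allow: false result would map to the Plane A degrade copy on the client rather than to an error.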

4.7 Offline & the audio truth (corrected diagnosis + honest coverage)

The flagship "study out loud on your commute, offline" scenario is not real today and we will not claim it before it is. Two distinct facts, both [repo-verified]:

(a) Why offline audio fails — corrected root cause. AZ audio streams from a cross-origin Pages CDN (AUDIO_CDN = https://passlane-5jv.pages.dev, index.html:4270; clipSrc() = ${AUDIO_CDN}/audio/${name}.mp3, index.html:4293). The service worker does cache .mp3 cache-first (sw.js:56, the isCacheFirst branch) — but it never sees these requests, because the fetch handler bails on cross-origin at the top: if (url.origin !== location.origin) return; (sw.js:34). (The earlier draft's "sw.js refuses to cache mp3" was wrong; the cause is the cross-origin early-return, which changes the fix.)
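The effect of that early return can be reproduced in isolation. The sketch below mirrors only the quoted origin comparison; APP_ORIGIN is a hypothetical placeholder, since the app's real origin is not stated here.

```javascript
// Mirrors the quoted guard `if (url.origin !== location.origin) return;`.
// Returns true only when the service worker's fetch handler would go on to handle the request.
function swHandles(requestUrl, appOrigin) {
  return new URL(requestUrl).origin === appOrigin;
}

const APP_ORIGIN = 'https://app.example'; // hypothetical app origin, for illustration only

console.log(swHandles('https://passlane-5jv.pages.dev/audio/pc001-q.mp3', APP_ORIGIN)); // false
console.log(swHandles('https://app.example/audio/pc001-q.mp3', APP_ORIGIN));            // true
```

The CDN clip never reaches the cache-first branch, while the same path served from the app's own origin would.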

The fix, and its constraint — an Owner fork (§9):

Option A (recommended): serve audio same-origin. Move /audio under the app origin (or proxy clips through the Worker). Then the existing cache-first SW logic just works — no new IndexedDB code, no CORS dance. This also dovetails with the D1 schema, which [repo-verified] already documents "Audio references that map to private R2 object keys" — i.e., the infra was designed for app-owned/private audio, so this is aligned with intent that already exists, and it simultaneously closes the CDN integrity hole (§5.4).
Option B: keep cross-origin + add an IndexedDB blob clip-cache. Then it is mandatory that the CDN send permissive CORS (Access-Control-Allow-Origin), blobs are fetched with mode:'cors', and a spike proves a cached clip survives airplane-mode. More code, more failure surface.

We pick one explicitly before Phase 1; the recommendation is A because it reuses shipped SW behavior and resolves integrity in the same move.

(b) Coverage: the honest numbers. [repo-verified] On disk: 430 AZ clips = 215 -q + 215 -a; states-manifest.json enumerates 215 pc/es ids (manifest voice: af_bella). Therefore ~67% of AZ questions (215 of 323) can read the question aloud; ~33% (108) cannot read anything, and 0% of Coach copy is voiced. The af_bella TTS pipeline is absent from the repo. Consequences, made explicit:

Phase 1 ships two honest tiers: Text Coach (fully offline once Option A lands; $0; 100% of the bank) and Spoken Coach (only where a clip exists; grows as the pipeline drains the backlog).
The voice-coverage work-stream (voice the 108 unvoiced AZ questions + all new Coach copy) is a first-class roadmap item with a cost line (§6.3), not a footnote. The CRAWL "spoken Coach" success criterion is gated on that pipeline existing — until it does, the unvoiced third is text-only by the §3.8 rule.

Install-size impact [measured from repo]: adding rationale + 3 distractor explanations roughly doubles per-bank text (AZ ~256KB → ~0.5MB; six states → low-single-digit MB) — an accepted install add. Audio is the real storage cost and is bounded explicitly in §6.4. We do not inherit a parked offline-audio problem silently; we name it, price it, and gate on it.

4.8 Data-flow (ASCII)

OFFLINE — one-time, Plane A (the free path)
   vetted question bank
    → SME audit (state-law first) + content-hash
    → GENERATE: Haiku 4.5 + Batch, citations on → grounded prose
    → JUDGE: a separate check drops any sentence that adds a fact not in the source
    → human SME review → ship into the app
       (text for 100% of the bank; af_bella audio where voiced)

ON DEVICE (app/index.html)
   mic → one listen chokepoint → classify the utterance:
      • a letter (A–D)        → the normal answer flow (unchanged)
      • "ask" + ask-mode on   → Coach answers, grounded and cited
   exam in progress → Coach is fully disabled (hard wall)
   Train / Test → local pre-generated text (offline) → spoken if a clip exists

EDGE — Pro, online only (Plane B)
   verify store receipt → per-install token → per-user + global caps + kill-switch
    → Claude (Haiku / Sonnet / Opus) → any uncited claim is suppressed
    → if it errors or is slow, fail closed to the local grounded explanation

5. Reliability & Trust

5.1 Hallucination defense (the stack, not a setting)

Grounding (Citations over vetted explanations, no open web at runtime) → explicit "I don't know" permission in the prompt → the Phase-2 containment judge that fails the build on any introduced statute/number/citation → a runtime guard that suppresses uncited claims → strictest grounding + mandatory human review for state-law items. The corpus being finite is converted from a limitation into a trust feature.

5.2 The evaluation / accuracy harness (how we prove correctness)

A Node vm/assert/exit-1 CI gate, modeled on voice-sandbox/harness.js, over a golden question→grounded-answer set, asserting: (1) every answer cites a correct source block; (2) the cited explanation actually contains the claim [the Phase-2 judge, run as a gate]; (3) out-of-scope → refusal, and real-world-advice → redirect; (4) no answer leaks under a simulated exam state; (5) the recall@k bar that triggers the embedding upgrade (target recall@k ≥ 0.95 on the golden set; a drop below forces §4.2's vector-DB upgrade). Correctness is a mechanical gate the codebase already lives by.
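A minimal sketch of such a gate, covering assertions (1), (3), and (4). The golden-set entry shape and the answerFn signature are hypothetical; a CI wrapper would exit 1 whenever the returned list is non-empty.

```javascript
// Minimal gate runner in the spirit of an assert/exit-1 harness.
// `goldenSet` entries and `answerFn` are hypothetical shapes for illustration.
function runGate(goldenSet, answerFn) {
  const failures = [];
  for (const item of goldenSet) {
    const answer = answerFn(item.question, { isExam: item.isExam === true });
    if (item.isExam) {
      if (answer !== null) failures.push(`${item.id}: leaked during exam`);
    } else if (item.outOfScope) {
      if (!answer || !answer.refused) failures.push(`${item.id}: should refuse`);
    } else if (!answer || !answer.sources.includes(item.expectedSource)) {
      failures.push(`${item.id}: missing citation ${item.expectedSource}`);
    }
  }
  return failures;
}
```

Returning the failure list instead of exiting inside the function keeps the gate testable on its own.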

5.3 Phase 0 — audit the bank itself (the unexamined single point of failure)

Grounding amplifies the source: a wrong vetted explanation becomes a confident, cited, spoken wrong lesson. Coach's correctness ceiling is the bank's. Before Coach ships: an SME review pass over state-law items first (the AZ-specific law categories — highest legal exposure), then the rest by difficulty; stamp every question with last_reviewed (extend the existing D1 version/status columns); and ship a "this looks wrong" report affordance from day one on every Coach response. For the AZ launch bank (323 Qs) this is a full human pass — cheap at that size and it removes all ambiguity for the content that defines first impressions. Pass rubric (the gate is numeric): review is "passed" when an SME confirms 0 factual errors in state-law items and ≤2% factual-error rate across all 323, every flagged item corrected and re-reviewed. This is also the concrete Anthropic-AUP "qualified professional reviews before dissemination" mechanism, enforced by the rule disseminated == passed_review at the content-hash gate.
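The numeric part of the pass rubric can be expressed directly. This sketch assumes each reviewed item is recorded as { isStateLaw, hasFactualError }; it does not model the "corrected and re-reviewed" step, which remains a process requirement.

```javascript
// Numeric pass rubric: zero state-law errors, and at most 2% errors bank-wide.
// `items` is an assumed array of { isStateLaw, hasFactualError } review results.
function reviewPassed(items) {
  const stateLawErrors = items.filter((i) => i.isStateLaw && i.hasFactualError).length;
  const totalErrors = items.filter((i) => i.hasFactualError).length;
  const errorRate = items.length === 0 ? 1 : totalErrors / items.length; // an empty review never passes
  return stateLawErrors === 0 && errorRate <= 0.02;
}
```

On a 323-question bank the 2% ceiling works out to at most 6 erroneous items, none of which may be state-law.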

5.4 Exam integrity — two surfaces, both walled

In-app: Coach globally disabled whenever isExam (inherits the shipped tap-only/no-mic/no-read-aloud gates); the proxy rejects any in-progress-exam request; mock-exam coaching is post-scoring only; a harness scenario asserts Coach is inert during exams.
The off-app wall the team must close — REAL and confirmed. [repo-verified] The public unauthenticated Pages CDN serves the answer/explanation clip at https://passlane-5jv.pages.dev/audio/pc001-a.mp3 (built by clipSrc(), index.html:4293; prefetched at :4326) with sequential ids — the spoken answer key is a ~10-line scrape today, contradicting the Worker's own "audio never public" principle and the D1 schema's "private R2 object keys" note. (Path corrected: it is /audio/<id>-a.mp3, not root <id>-a.mp3; the root path returns the SPA HTML fallback, which is exactly why a careless test would under-rate the risk.) Decisions: (1) all new companion repair/explain audio uses hashed, non-sequential keys (clip/<sha256(exam||id||"repair")>.mp3) so AI answer content is not enumerable; (2) for the existing base answer audio, an owner call (§9): migrate behind the same-origin/auth + rate-limited Worker/R2 path (this is Option A of §4.7 — one move fixes offline caching and integrity) or consciously accept that rationales are public — but regardless, no new clip and no future audio gets a sequential ${id}-a key on a public bucket.
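A sketch of the hashed key from decision (1), reading `||` as plain string concatenation and using Node's built-in crypto module. The function name and the exam identifier format are assumptions.

```javascript
const { createHash } = require('node:crypto');

// Derives clip/<sha256(exam || id || "repair")>.mp3, reading `||` as plain concatenation.
// The exam identifier format (for example 'az') is an assumption for illustration.
function repairClipKey(exam, id) {
  const digest = createHash('sha256').update(`${exam}${id}repair`).digest('hex');
  return `clip/${digest}.mp3`;
}
```

The key is deterministic, so the export pipeline and the app can derive it independently. Note that an unkeyed hash is only as unguessable as its inputs; whether a secret salt is also needed is left to the owner decision in §9.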

5.5 Explain-back grading — measurable gate before it can ever penalize

Ungraded self-check is the default (zero false-negative risk, full retrieval benefit). Graded mode — which may touch Leitner state — unlocks only after the Sonnet grader hits a measured bar (≥95% agreement, ≤2% false-"wrong") on a fixture of 100+ human-labeled paraphrases, run as an opt-in on-device-only eval that reports an aggregate accuracy number with no transcript stored (so the no-tracking promise holds). Even then: grade generously, accept the concept in any phrasing, never silently demote (always "I'll bring this back"), always offer "I actually meant X." This numeric instrument is the gold standard the §7 success criteria are modeled on.
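The unlock bar could be checked with something like the sketch below. It assumes each fixture result pairs a human label with the grader's label, and that the false-"wrong" rate is measured over the whole fixture; measuring it over human-"right" items only would be a stricter alternative.

```javascript
// `results` is an assumed array of { human: 'right'|'wrong', grader: 'right'|'wrong' }.
// Assumption: the false-"wrong" rate is computed over the whole fixture.
function graderClearsBar(results) {
  if (results.length < 100) return false; // the fixture must hold 100+ labeled paraphrases
  const agree = results.filter((r) => r.human === r.grader).length;
  const falseWrong = results.filter((r) => r.human === 'right' && r.grader === 'wrong').length;
  return agree / results.length >= 0.95 && falseWrong / results.length <= 0.02;
}
```

Only the boolean and the aggregate rates need to leave the eval; no paraphrase text is retained.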

5.6 Privacy & compliance (the launch gates)

[repo-verified] The zero-transmission claim is asserted in at least three places, including in-app at index.html:1226 ("no accounts and no tracking … does not collect, transmit, sell, or share") plus privacy.html plus the paywall legal link. A live transcript-forwarding proxy makes all of them false at once — simultaneously an App Review 5.1.2(i) rejection risk, a Google Play Data-safety mismatch, and FTC exposure. Therefore:

Plane A stays 100% on-device → "core study sends nothing" stays literally true and is the default. Phases 1–2 ship under the current policy unchanged.
Plane B requires, before it ships: rewriting every no-transmission surface (privacy.html + the in-app block at 1226 + onboarding/terms + the paywall legal line) to name Anthropic as processor with the real retention fact; a one-time in-app consent gate before the first AI turn; on-device STT so raw audio never leaves the device (preserving "we never transmit audio" verbatim); and stripping name/email so that only question_id + text is forwarded.
Analytics/telemetry are banned while "no tracking" stands. The struggle signal is the existing local Leitner state. The only added local surfaces are an aggregates-only most-missed-concept counter (drives pre-gen priority) and a mandatory per-reply "Report this response" control — which Google Play's generative-AI policy requires and is therefore a launch blocker.
Where the report goes in the OFFLINE phases (CRAWL/WALK), where there is no Worker and no network is permitted: the control queues to a local az_reports buffer (question id + the offending Coach text + timestamp) that surfaces in the SME's next review pass, satisfying the reporting requirement without breaking the no-transmission promise. Only in RUN (Plane B, post-consent) does it additionally POST to the new sibling Worker's report sink. [repo-verified] The shipping app makes zero /api/ calls, so we do not rely on the parked /api/report; the live sink is the new Worker or a minimal dedicated endpoint.
Age rating stays general-adult: Coach is topic-locked, no end-user open web; answer Apple's age-rating questionnaire honestly (mic/audio = App Functionality only).
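The offline half of the "Report this response" control described above can be sketched as a purely local queue. Assumptions: the helper names, the record field names, and the use of a localStorage-style store are illustrative; the plan only fixes the buffer name (az_reports) and its three fields. The storage object is injected so the sketch runs outside a browser.

```javascript
// Sketch only: queues a report locally with no network call.
// REPORTS_KEY comes from the plan; everything else is hypothetical.
const REPORTS_KEY = 'az_reports';

function queueReport(storage, questionId, coachText, now = Date.now()) {
  // Read-modify-write of a JSON array held under one key.
  const queue = JSON.parse(storage.getItem(REPORTS_KEY) || '[]');
  queue.push({ questionId, coachText, ts: now });
  storage.setItem(REPORTS_KEY, JSON.stringify(queue));
  return queue.length;
}

// In-memory stand-in for localStorage, for tests and non-browser runs.
function memoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => { data.set(key, String(value)); },
  };
}

const store = memoryStorage();
queueReport(store, 'pc001', 'Offending Coach text goes here.');
```

In the app, the same function would presumably receive window.localStorage; in RUN, a POST to the Worker's report sink would be added alongside it rather than replacing it.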

5.7 Rollback / kill-path for a bad shipped corpus (the highest-consequence gap)

A pre-generated Coach explanation ships inside the binary. If a wrong, cited, spoken lesson reaches production, content-hashing and the next review cycle are too slow for a licensing exam. So:

Remote per-question Coach-content denylist. The sibling Worker serves a tiny, cacheable denylist of question_ids whose Coach copy is suppressed. The app fetches it opportunistically (cheap, cacheable, fails-open-to-showing-base-bank-only) and, for any listed id, falls back to the terse vetted bank explanation and hides Coach's elaboration + audio until a fixed release lands. This is a server flag that needs no app update to neutralize a specific bad lesson.
Privacy-clean: the denylist is a download of ids the app already has; it transmits nothing about the user, so it is compatible with the Plane-A on-device posture (it is a content update channel, not telemetry).
Trigger: an SME or an inbound az_reports/Worker report escalates an id onto the denylist within hours; the permanent fix (corrected explanation, regenerated artifact, hash bump) follows on the normal cadence.
This closes the "confident cited wrong lesson is live until the next app release" gap explicitly.
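A minimal sketch of the suppression path, under stated assumptions: the denylist is served as a JSON array of question ids, the question object carries explanation and coach fields, and a failed fetch keeps the last cached list so offline study is unaffected. None of these shapes or names come from the repo, and the behavior on fetch failure in particular is one reading of the plan, not a confirmed one.

```javascript
// Sketch only: opportunistic denylist fetch plus per-question fallback.
// fetchFn is injected (the app would pass the global fetch); URL is hypothetical.
async function loadDenylist(fetchFn, url, cachedIds = []) {
  try {
    const res = await fetchFn(url);
    if (!res.ok) return new Set(cachedIds);
    const ids = await res.json();
    return new Set(Array.isArray(ids) ? ids : cachedIds);
  } catch {
    // Offline or network error: keep whatever list was cached last time.
    return new Set(cachedIds);
  }
}

function explanationFor(question, denylist) {
  // Listed id (or no Coach copy): terse vetted bank explanation, no Coach audio.
  if (denylist.has(question.id) || !question.coach) {
    return { text: question.explanation, audio: null, coach: false };
  }
  return { text: question.coach.text, audio: question.coach.audio, coach: true };
}
```

The request carries no user data: it downloads ids the app already holds, which is what keeps it on the content-update side of the line rather than the telemetry side.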

5.8 Validating that Coach actually works (outcome measurement under no-tracking)

The pedagogical effect sizes (elaborated feedback d≈.49; testing g≈.5–.6; interleaving) justify the design; they don't prove transfer to PassLane users. Under the no-tracking constraint we still measure outcomes, the §5.5 way:

Opt-in, on-device, aggregates-only readiness self-report. With explicit consent, the app keeps a local before/after readiness or mock-score delta and can surface an aggregate number to the user themselves ("your calibrated readiness moved from X to Y"). No transcript, no per-event upload.
Optional, separately-consented, aggregates-only beacon (owner fork, §9B): if the owner ever wants population-level signal, a single coarse aggregate (e.g., readiness-delta bucket) sits behind its own consent. It is never on by default, never per-event, never identifying.
This converts success from "it ships and is correct" into a testable "it moves readiness," without touching the no-tracking promise by default.

5.9 Honesty as moat

Every defense above is also a marketing asset no incumbent can match: every answer traced to a vetted explanation; refuses rather than invents; refuses to leak an answer during an exam; a wrong lesson can be killed remotely within hours; your questions are never used to train AI and are deleted within ~7 days. This is the credible opposite of ExamFX's refund/guarantee grievances and the antidote to the 33–79% hallucination rates that plague general AI tutors — the brand's "honesty is the moat," made operational.

6. Cost Model at Scale

The architecture's whole point: marginal AI cost approaches zero because the expensive work is computed once, offline, and shipped. This is precisely the economics ("per-user inference ate the margins") that killed Quizlet's Q-Chat — designed out. All dollar figures use [API-assumption] pricing (2026-01); the structural conclusion (near-zero, one-time, single-digit dollars) is robust to reasonable price drift and is what matters.

6.1 One-time pre-generation (Plane A) — arithmetic shown

[repo-verified counts: AZ = 323; all six banks = 3,392 (CA 583 + FL 641 + NY 667 + NC 588 + TX 590 + AZ 323).] Per question ≈ 600 tokens in (stem+choices+explanation+prompt) / ~340 tokens out (elaborated rationale + 3 distractor repairs). Haiku 4.5 Batch [API-assumption: $0.50 in / $2.50 out per 1M].

AZ only (323): input 0.19M tokens → ~$0.10; output 0.11M tokens → ~$0.28; one-time total ≈ $0.40–$2
All six states (3,392): input 2.04M tokens → ~$1.02; output 1.15M tokens → ~$2.88; one-time total ≈ $4–6

The two-phase judge pass adds a second Haiku call of similar magnitude; the realistic envelope is single-digit dollars for the whole six-state corpus, one-time. We never anchor on a number a reviewer can falsify in a spreadsheet — the conclusion is what matters and it is robust. (If §4.4's incompatibility is lifted, the judge folds into generation and this roughly halves.) Regeneration on a bank edit is cheap because answers are content-hashed.
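The arithmetic in this subsection can be reproduced in a few lines, which also makes it easy to re-run when prices are re-confirmed. The bank counts and per-question token estimates are the ones stated above; the prices are the [API-assumption] figures; modeling the judge pass as a second pass of equal cost is a simplifying assumption.

```javascript
// Sketch only: reproduces the 6.1 pre-generation arithmetic.
const BANKS = { CA: 583, FL: 641, NY: 667, NC: 588, TX: 590, AZ: 323 };
const TOKENS_IN = 600;   // stem + choices + explanation + prompt, per question
const TOKENS_OUT = 340;  // elaborated rationale + 3 distractor repairs
const PRICE_IN = 0.5;    // USD per 1M input tokens  [API-assumption]
const PRICE_OUT = 2.5;   // USD per 1M output tokens [API-assumption]

function pregenCost(questions, passes = 1) {
  const inputM = (questions * TOKENS_IN) / 1e6;   // millions of input tokens
  const outputM = (questions * TOKENS_OUT) / 1e6; // millions of output tokens
  const usd = passes * (inputM * PRICE_IN + outputM * PRICE_OUT);
  return { inputM, outputM, usd };
}

const allQuestions = Object.values(BANKS).reduce((sum, n) => sum + n, 0);
console.log('AZ, one pass:', pregenCost(BANKS.AZ).usd.toFixed(2));
console.log('Six states, one pass:', pregenCost(allQuestions).usd.toFixed(2));
console.log('Six states, with judge pass:', pregenCost(allQuestions, 2).usd.toFixed(2));
```

Even with the judge pass doubled in, the six-state total stays in single-digit dollars at these prices, which is the structural claim the section depends on.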

6.2 Runtime, at scale

10k

Plane A (offline teaching) $0 — ships in app data

Plane B (live Talk, Pro only) A fraction subscribe; capped per-user; Sonnet whole-bank-prefix cache → a few cents/session

Net Near-zero; comfortably inside Worker free tier

100k

Plane A (offline teaching) $0

Plane B (live Talk, Pro only) Per-user daily/monthly budgets + global ceiling hold the line; degrade to Plane A when spent

Net Bounded by design, not by hope

Plane A (offline teaching) $0

Plane B (live Talk, Pro only) Worker free tier is ~100k req/day; the live tail is a small Pro fraction; if it ever approaches the cap, scale the Worker (still cents per active session) and the budgets cap worst-case

Net Stays near-zero because the heavy path never hits the network

Why it stays near-zero: (1) the teaching corpus is pre-generated and local — the most-used feature costs nothing at runtime; (2) live calls exist only for a learner's own words, a small Pro-gated fraction; (3) the live path is Haiku-default with whole-bank prompt caching at 0.1×; (4) per-user and global spend caps make the worst case bounded, not unbounded; (5) Opus is a rare, server-enforced ceiling. The premium price is therefore near-pure margin that funds premium design. (Anchor the price against exam-prep incumbents — a ~$130 ExamFX seat for 60 days — not against $4 consumer tutors; $59.99/yr never-expiring is the affordable, premium, honest option.)
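Point (4) is the piece that bounds the worst case, so a sketch of the check is useful. Assumptions: this would run server-side in the sibling Worker, the function and field names are hypothetical, and the cap values in the usage lines are placeholders rather than proposed budgets. Amounts are integer cents so comparisons do not drift with floating point.

```javascript
// Sketch only: decides whether a live turn may run on Plane B
// or must degrade to the local Plane A answer. All names are hypothetical.
function routeLiveTurn(spent, caps, estimateCents) {
  if (caps.killSwitch) return { plane: 'A', reason: 'kill-switch' };
  const checks = [
    ['global', spent.global, caps.global],
    ['user-day', spent.userDay, caps.userDay],
    ['user-month', spent.userMonth, caps.userMonth],
  ];
  for (const [name, used, cap] of checks) {
    // Reject before spending: the turn must fit inside every remaining budget.
    if (used + estimateCents > cap) return { plane: 'A', reason: `${name} cap` };
  }
  return { plane: 'B', reason: 'within budget' };
}

const caps = { killSwitch: false, global: 50000, userDay: 50, userMonth: 400 };
console.log(routeLiveTurn({ global: 1200, userDay: 10, userMonth: 90 }, caps, 3));
console.log(routeLiveTurn({ global: 1200, userDay: 49, userMonth: 90 }, caps, 3));
```

Checking the estimate before the call, rather than reconciling after it, is what makes the ceiling a hard one: an over-budget request never reaches the model.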

6.3 af_bella voice generation — a real, priced work-stream (was missing)

Because [repo-verified] 108 AZ questions are unvoiced + 100% of Coach copy is unvoiced and the pipeline is absent from the repo, voicing is a line item, not an afterthought:

Scope, one-time AZ: 108 missing question clips + ~323 elaborated-rationale clips + up to ~969 distractor-repair clips + a handful of scripted warmth/calibration lines. Per-state this scales with bank size.
How: a TTS step wired into export-pack.mjs that (a) renders new copy to af_bella, (b) stamps and asserts voice === 'af_bella', (c) writes same-origin (per §4.7 Option A) with hashed keys for any answer-revealing audio (per §5.4). The specific TTS service/model is an owner/integration decision (it must reproduce the existing af_bella timbre); the cost driver is per-clip synthesis, typically fractions of a cent to a few cents per short clip — low tens of dollars one-time for the full AZ Coach voice set at commodity neural-TTS rates, regenerated only on bank edits.
Gating: spoken-Coach success criteria (§7) do not pass until this pipeline exists and the AZ backlog is drained; until then the unvoiced subset is text-only by the §3.8 rule and fully functional.

6.4 Storage / install-size budget (was unbounded)

Text corpus: [measured] AZ ~256KB → ~0.5MB; six states low-single-digit MB. Accepted.
Audio cache (the real cost): a fully-cached AZ voice set is ~215 question clips + Coach clips ≈ low tens of MB per state (order ~20–40MB AZ once all Coach copy is voiced; commodity ~50–150KB/clip). Hard rules: (1) audio cache is opt-in per state (reuse the existing "pre-save this state's voice" affordance), never silent; (2) a stated ceiling — cap the on-device audio cache (e.g., ≤150MB across states) with LRU eviction so it can't grow unbounded; (3) the shipped install carries text only (clips stream/cache on demand), keeping the binary within App Store / Play norms; (4) surface a "manage offline voice" control showing MB used. The offline-audio promise now has a bounded, user-visible storage cost.
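Rule (2), a byte ceiling with LRU eviction, can be sketched as a small index that tracks clip sizes and decides what to evict. Assumptions: the class and method names are hypothetical, and the index only does bookkeeping; the actual clip bytes would live in whatever store the app uses, and the caller would delete the evicted keys from it. The 150MB figure is the example ceiling from the text.

```javascript
// Sketch only: size-bounded LRU bookkeeping for cached audio clips.
// A Map iterates in insertion order, so the first key is the least recently used.
class AudioCacheIndex {
  constructor(maxBytes) {
    this.maxBytes = maxBytes;
    this.entries = new Map(); // key -> size in bytes
    this.bytes = 0;
  }

  // Mark a clip as just used by moving it to the most-recent end.
  touch(key) {
    const size = this.entries.get(key);
    if (size === undefined) return false;
    this.entries.delete(key);
    this.entries.set(key, size);
    return true;
  }

  // Record a clip; returns the keys evicted to make room, or null if it can never fit.
  put(key, size) {
    if (size > this.maxBytes) return null;
    if (this.entries.has(key)) {
      this.bytes -= this.entries.get(key);
      this.entries.delete(key);
    }
    const evicted = [];
    while (this.bytes + size > this.maxBytes) {
      const [oldKey, oldSize] = this.entries.entries().next().value;
      this.entries.delete(oldKey);
      this.bytes -= oldSize;
      evicted.push(oldKey);
    }
    this.entries.set(key, size);
    this.bytes += size;
    return evicted;
  }
}

const index = new AudioCacheIndex(150 * 1024 * 1024); // example ceiling from the text
```

The bytes field is also the number a "manage offline voice" control would display, so the same structure serves rules (2) and (4).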

7. The Build Roadmap

Every success criterion below carries a number and an instrument, modeled on §5.5, and references the runbooks that already exist in docs/ [repo-verified: STUDY-SESSION-TEST-MATRIX.md, IOS-VOICE-TEST-PLAN.md, LAUNCH-RUNBOOK.md] rather than vague phrases.

CRAWL — ships first, post-launch (offline, FREE, zero new infra, zero new privacy surface)

Scope: Phase 0 bank audit (state-law SME pass + last_reviewed + "looks wrong" → az_reports local queue) → Memory-opener + TRAIN (text, offline, 100% of bank): pre-generated elaborated feedback + misconception repair at the feedback seam, plus scripted reframing/calibration lines. Then, after the af_bella pipeline (§6.3) exists and Option-A same-origin audio (§4.7) lands, spoken Coach + SW-cached offline audio for the voiced subset, with the unvoiced third in the §3.8 text-only state. Gate a 12-interaction Coach taste mirroring [repo-verified VOICE_FREE_LIMIT=12]; the elaborated feedback itself is free to every learner.

Success criteria (numeric, instrumented):

node voice-sandbox/harness.js exits 0; zero diff to the mode_select→…→advancing call graph (asserted by harness, not eyeballed).
Eval harness (§5.2) passes: citation-present on every claim, Phase-2 containment clean, out-of-scope→refusal, no-exam-leak, recall@k ≥ 0.95.
AZ corpus passes the §5.3 rubric: 0 state-law errors, ≤2% overall factual-error rate across all 323.
Airplane-mode smoke passes on the STUDY-SESSION-TEST-MATRIX.md device list (text path renders, no network calls, no errors) — pass = green on every listed device/OS.
privacy.html + in-app:1226 unchanged and still literally true; grep confirms zero new /api/ calls in CRAWL.
Spoken-Coach criterion (gated): voiced subset plays offline after one warm; unvoiced subset shows "text only for this one," never a dead control.

WALK — TEST tier + the conversational wedge

Scope: discrimination drills on existing Leitner/weak machinery; mock-exam before/after coaching (mute during) + calibration confrontation; ungraded explain-back; Ask-mode proven in the harness first (S12+), then TEXT-ASK as the conversational surface.

Success criteria (numeric, instrumented):

Harness S12+ scenarios green: no ASK↔ANSWER state leak, Coach inert during isExam, a late ASK transcript never submits as an answer.
Mock-coaching fires 0 times mid-exam across the test-matrix exam runs (asserted).
Text-ASK answers are 100% grounded + cited on the golden set; uncited claim → suppressed.
On-device long-form mic spike scheduled against IOS-VOICE-TEST-PLAN.md for both iOS and Android before any spoken ASK is greenlit.

RUN — Plane B live tier (Pro, online, the metered tail)

Scope: TALK (Sonnet, text-reply) + graded explain-back (past the §5.5 gate) + the Opus end-of-session diagnostic, behind the key-holding sibling Worker, with the §5.7 remote denylist live. Spoken ASK only if the on-device spike passes.

Success criteria — all blocking:

Server-verified receipt → per-install token live and tested; client flag gates UI only (verified it cannot authorize spend).
Per-user + global budget caps + kill-switch live and tested; a forced over-budget session degrades to Plane A, costs the user no turn, shows the grounded fallback.
Live latency meets §3.9: first token ≤1.5s p50 / ≤3s p95 on a warm cache; 8s timeout → fail-closed to Plane A.
privacy.html + in-app:1226 + terms + paywall rewritten, consent gate shipped, Play Data-safety updated — in the same release.
Per-reply report control wired to a verified-live sink (new Worker); ZDR enabled (confirmed on the route).
Graded explain-back clears ≥95% agreement / ≤2% false-wrong on-device before it touches Leitner.
Remote Coach-content denylist demonstrably suppresses a planted bad id within one app launch.
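The latency criterion above (8s timeout, fail-closed to Plane A, failed turn costs no budget) can be sketched as a race between the live call and a timer. Assumptions: the function name, the injected liveCall and localFallback callbacks, and the chargesBudget flag are hypothetical; a production version would presumably also abort the in-flight request, which this sketch leaves out.

```javascript
// Sketch only: live reply with a hard timeout and a grounded local fallback.
async function askLive(liveCall, localFallback, timeoutMs = 8000) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error('timeout')), timeoutMs);
  });
  try {
    // Whichever settles first wins: the live reply or the timeout rejection.
    const reply = await Promise.race([liveCall(), timeout]);
    return { source: 'live', reply, chargesBudget: true };
  } catch {
    // Timeout or any live-path error: fall back locally and charge nothing.
    return { source: 'local', reply: localFallback(), chargesBudget: false };
  } finally {
    clearTimeout(timer);
  }
}
```

Treating a timeout and an upstream error identically keeps the learner-facing behavior simple: either the live answer arrives in time, or the grounded local text appears and the turn is not counted.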

Tiering law: free gets real teaching (elaborated feedback is local data, so it's free) — Pro is the conversation and live coaching, never "pay to get explanations at all." Coach is the headline Pro unlock — "your instructor, on call" — reusing the existing pw- paywall.

8. Risks & Mitigations

Risk Bank correctness is the true SPOF — grounding amplifies a wrong/stale explanation into a confident, cited, spoken wrong lesson

Severity High

Mitigation Phase 0 SME audit (state-law first) + numeric rubric (§5.3) + last_reviewed + day-one az_reports affordance; correctness bounded by the bank's; disseminated==passed_review

Risk Bad shipped corpus is live until next release — a cited wrong lesson is in the binary

Severity High

Mitigation Remote per-question Coach denylist (§5.7) suppresses a bad id to base-bank-only with no app update; privacy-clean (download, not telemetry)

Risk Exam-key leak via public CDN — [repo-verified] /audio/pc001-a.mp3 is HTTP-200 enumerable with sequential ids; naively keying new AI clips ${id}-a widens it

Severity High

Mitigation Hashed non-sequential keys for all new audio; owner picks §4.7 Option A (same-origin/R2 — fixes integrity and offline in one move) vs accept-rationale-public; never a new guessable -a clip publicly

Risk Runaway live-AI bill — [repo-verified] isPro() is a spoofable localStorage flag; device-id rotatable; Opus unbounded

Severity High

Mitigation Server-verified receipt → per-install token; per-user and global budgets + kill-switch; fail-closed to Plane A; client flag gates UI only

Risk False privacy policy — a live proxy makes the zero-transmission claim ([repo-verified] 3 places incl. in-app:1226) false → App Review + Play + FTC

Severity High

Mitigation Plane A stays offline (policy true); Plane B ships only with every surface rewritten + consent gate + on-device STT + Data-safety update, coupled to the release

Risk Hallucinated / over-synthesized law — fabricated statute/number a student repeats on the exam

Severity High

Mitigation Citations grounding + Phase-2 containment judge (fails build on introduced facts) + runtime uncited-claim suppression + strict state-law human review

Risk Voice-funnel regression — an ASK branch or multi-turn mic breaks the partialResults-empty contract / staleness guards tuned for short answers

Severity High

Mitigation All mic via startListening/stopListening; ASK is a separate posture with its own gen-id/budget; harness S12+ exit 0 before any UI

Risk Offline audio fixed at the wrong layer — [repo-verified] clips bypass the SW via the cross-origin early-return (sw.js:34), not a cache refusal

Severity High

Mitigation Correct diagnosis in §4.7; Option A same-origin reuses the existing cache-first SW (no new code); Option B requires CORS + airplane-mode spike before relying on it

Risk Spoken Coach over-promised — [repo-verified] only 215/323 AZ voiced, 0% Coach copy voiced, pipeline absent

Severity High

Mitigation Text Coach ships for 100% of bank; voicing is a priced work-stream (§6.3) with the unvoiced third in a defined text-only UX (§3.8); spoken criteria gated on the pipeline

Risk Spoken multi-turn ASK may not work on-device — collides with the 12s cap + teardown race; untested on the local fork

Severity Med

Mitigation Spoken ASK spike-gated (iOS and Android, per IOS-VOICE-TEST-PLAN.md); TEXT-ASK ships regardless so the tier doesn't depend on the spike

Risk Explain-back false "you're wrong" silently demoting earned mastery

Severity Med

Mitigation Ungraded self-check is default; graded gated behind ≥95%/≤2% on-device eval; never silent-demote; "I actually meant X" recheck

Risk Store AI-feature rejection — missing reporting / consent / age-rating; report sink wired to a parked route

Severity Med

Mitigation Per-reply report control: local az_reports queue in offline phases, verified-live Worker sink in RUN; consent gate; honest age-rating; reporting treated as a launch blocker

Risk Live-path feels slow for an anxious commuter

Severity Med

Mitigation §3.9 budget: stream tokens, source chip pinned immediately, ≤1.5s/≤3s first-token target, 8s→fail-closed to grounded local text, failed turn costs no budget

Risk Over-instrumentation destroys pace/feel — a regression of shipped restraint

Severity Med

Mitigation Strict frequency caps (explain-back ~1-in-5, warmth one-shot at existing thresholds); everything pull-driven/skippable; never in Exam mode; one-jewel-per-section binds Coach

Risk Stale pre-gen artifacts when the bank is edited

Severity Med

Mitigation Key each artifact to a question content-hash; regenerate on change (cheap via Batch); denylist covers the gap until a fixed release

Risk Offline-audio storage unbounded — a fully-cached voice set is tens of MB/state

Severity Med

Mitigation §6.4: opt-in per state, ≤150MB LRU ceiling, text-only install, "manage offline voice" control

Risk Outcome unproven — effect-size literature may not transfer; no-tracking blocks measurement

Severity Med

Mitigation §5.8 opt-in on-device aggregates-only readiness self-report; optional separately-consented coarse beacon (owner fork) — never on by default

Risk Load-bearing API facts stale — Citations×Structured-Outputs incompatibility, pricing, 4,096 cache floor are [API-assumption]

Severity Low

Mitigation Each tagged; §4.4 has a collapse-path if incompatibility lifts; pricing drift doesn't change the structural near-zero conclusion; confirm against current docs before Phase 1

9. Open Decisions for the Owner

To keep the owner's surface honest, this is split: calls the Principal owns (stated for transparency, not for re-litigation, per the "own the decision after research" discipline) vs genuine forks with real cost/liability tradeoffs and no obvious default.

9A. Decisions made — FYI, won't relitigate

Live Talk voice-out = text-first at launch. Spoken live replies can't be pre-recorded, and the shipped rule is recorded-clips-only; text-first protects the honesty/premium-voice promise. A system-TTS exception for live chat is declined at launch.
Premium tier shape = Opus baked into the existing $15.99/$59.99 Pro as the visible ceiling. The category's loud "premium AI not worth it" verdicts make a separate higher tier risky; Coach is the Pro headline, not an upsell-above-Pro.
Persona = unnamed, calm "voice of the app." Per warm/teacher-first brand law; no mascot, no avatar. This shapes copy and paywall framing, and it is decided.
First-run telemetry = hold the absolute "no tracking" line. Pre-gen priority comes purely from on-device Leitner state; outcome measurement is the §5.8 opt-in on-device self-report. No default analytics.

9B. Genuine forks needing the owner

Audio architecture + integrity (one decision, two payoffs). [repo-verified] the spoken answer key is publicly enumerable at /audio/<id>-a.mp3 today, and offline audio is broken by the cross-origin SW bypass. §4.7 Option A (move/proxy audio same-origin / behind the auth+rate-limited Worker/R2 the D1 schema already anticipates) fixes both integrity and offline caching in one move and reuses the existing cache-first SW — recommended. Option B (keep cross-origin CDN + add an IndexedDB blob cache with mandatory CORS + an airplane-mode spike) is more code and more failure surface. New AI audio uses hashed keys regardless. The Principal recommends A; the owner confirms the infra activation cost.

The qualified reviewer of record. Who is the licensed-insurance SME that signs off the pre-generated corpus and backstops factual-correctness liability and the Anthropic high-risk-review posture — Gino, or a contracted licensed agent? This sets the review-cadence pattern for every future vertical. No default the Principal can pick — it is a liability/credential question.

Launch path for the conversational tier. Ship the high-confidence offline companion + TEXT-ASK first and defer spoken multi-turn until the on-device spike passes (Principal recommends this), or treat spoken "talk to it" as core to the day-one promise and fund the spike up front? Genuine because it trades launch speed against a marquee promise, with real spike cost.

Population-level outcome signal (optional). Hold to purely on-device, user-only readiness numbers (§5.8, default), or authorize a single separately-consented, aggregates-only readiness-delta beacon so the team can see whether Coach moves pass rates at population scale? Genuine because even a coarse beacon touches the "no tracking" brand line and needs the owner's explicit blessing.

Supporting workspace: /Users/arizona/CLAUDE CODE/passlane/docs/companion/ [repo-verified: exists, empty]. The plan above is the document; no code was written and recon was read-only, per scope.

Private working document: unlisted, not indexed. PassLane / Somos LLC.