Babelio Appendix & Audit: 5/5 PASS
Quality rubric · Cross-link audit · Glossary

The playbook, audited against itself.

What "good" means for each of the 22 files, when to refresh them, the 5 cross-file consistency rules with PASS/FAIL evidence, the decisions made during generation, and every acronym used inline.

Bar for good: the 7 tests
Why this is worth $2K, not $0.

1. Populated, not blank
Every cell has Babelio's real number; every competitor is named.

2. Decisions made, with rationale
Pricing is $12/mo and here's why, not "consider three models."

3. Sequenced, not categorized
Week 1 produces artifacts 1-3; founders pay for order of operations.

4. Output-grade, not input-grade
The CAC payback formula is populated, with cells flagged for monthly refresh.

5. Linked, not isolated
The ICP feeds the pitch, which feeds the model. Change one number and the others are flagged stale.

6. Has a bar-for-good rubric
Each artifact has a written acceptance test (the table below).

7. Operational, not aspirational
Discount thresholds, MSA sign-off, and what triggers a board call are all included.
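
Test 5 ("change one number, flag the others stale") can be sketched as a walk over a dependency graph. This is a minimal illustration, not the playbook's actual mechanism: only the chain ICP → pitch → model comes from the text, and the node names `icp`, `pitch`, `model` and the function `stale_after_change` are hypothetical.

```python
from collections import deque

# Hypothetical dependency edges: upstream artifact -> artifacts that read its numbers.
# Only the chain icp -> pitch -> model is taken from the text above.
DOWNSTREAM = {
    "icp": ["pitch"],
    "pitch": ["model"],
    "model": [],
}


def stale_after_change(changed, graph):
    """Return every artifact downstream of `changed`, in breadth-first order."""
    seen = set()
    order = []
    queue = deque(graph.get(changed, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already flagged via another path
        seen.add(node)
        order.append(node)
        queue.extend(graph.get(node, []))
    return order


if __name__ == "__main__":
    # Changing a number in the ICP flags the pitch and the model as stale.
    print(stale_after_change("icp", DOWNSTREAM))
```

Changing the last artifact in the chain flags nothing, which is the expected behaviour for a file with no downstream readers.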

Per-file rubric
22 files × bar × refresh × owner.

Owner defaults to founder pre-team; re-assign as the org grows (see 15-decisions-raci).

File | Bar for good | Refresh | Owner
00 Strategy Memo | New VP reads in 5 min, speaks coherently about the company in their first meeting. | quarterly | CEO
01 ICP Brief | A junior SDR opens it Monday and qualifies real accounts without asking the founder. | quarterly | CEO
02 Value Prop | A learner reads the before-state and says "yes, that's my Tuesday night." | quarterly | CEO
03 Positioning | Differentiation is a real tradeoff axis customers pick on, not "better UX." | quarterly | CEO
04 TAM/SAM/SOM | Survives a partner-meeting interrogation: every cell sourced, every ratio defended. | quarterly | CEO
05 Pricing | A new salesperson quotes any deal Monday: tier, discount, seat band, overage. | quarterly | CEO
06 GTM Motion | A founder + 1 SDR run it as written; an investor sees the path to $1M ARR. | monthly | Sales
07 Growth Loop | Pull one lever, the 12-month MRR projection updates from the same formulas. | monthly | Growth
08 Retention | Read this week's cohort against the curve, know in 30s whether we're on track. | weekly | Product
09 Experiments | A growth lead opens it Monday and starts the top experiment with no questions. | weekly | Growth
10 Fin Model | Change one assumption cell → all downstream cells flag stale. | monthly | CEO
11 Hiring | A non-recruiter founder runs any of 5 searches start-to-offer from this page. | quarterly | CEO
12 OKRs | Not a blank template: Q1 objectives written with a paragraph defending each. | quarterly | CEO
13 Cadence | A day-1 exec knows where every conversation happens, runs the Monday review. | quarterly | CEO
14 KPI Spec | Anyone in the company can self-serve "are we on track?" via 20 defined metrics. | weekly | Ops
15 RACI | A founder defers a recurring decision-type to a documented owner. | monthly | CEO
16 Risk Register | Every risk has a named owner and a tripwire metric; premortem included. | monthly | CEO
17 90-Day Plan | Founder's next Monday is already planned; each week has 3-5 outcomes + owner. | weekly | CEO
18 Investor Upd. | Forward to the investor list without editing more than 5 numbers. | monthly | CEO
19 How We Operate | Reduces onboarding-conversation overhead to days; each value has an anti-example. | quarterly | CEO
20 AI Addendum | A new engineer understands AI COGS, the eval pipeline and the model lock-in stance. | continuous | Eng
21 Fundraise | Founder starts emailing 50 named investors Monday without setting up infra. | weekly | CEO
A Appendix | Founder can run a quarterly playbook audit using this page alone. | quarterly | CEO
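
The Refresh column can be turned into an overdue check. The sketch below is one way to do that, under stated assumptions: the day counts in `CADENCE_DAYS` are guesses (the playbook names the cadences but not exact windows), "continuous" files are never flagged, and the function names and sample dates are hypothetical.

```python
from datetime import date, timedelta

# Assumed window per cadence, in days. These lengths are not given in the playbook.
CADENCE_DAYS = {"weekly": 7, "monthly": 30, "quarterly": 90}


def is_overdue(last_refreshed, cadence, today):
    """True when a file has gone longer than its cadence without a refresh."""
    if cadence == "continuous":
        # Assumption: continuously maintained files are not tracked by date.
        return False
    return today - last_refreshed > timedelta(days=CADENCE_DAYS[cadence])


def overdue_files(files, today):
    """files: iterable of (name, cadence, last_refreshed) tuples."""
    return [name for name, cadence, last in files if is_overdue(last, cadence, today)]


if __name__ == "__main__":
    # Hypothetical refresh dates, for illustration only.
    sample = [
        ("08 Retention", "weekly", date(2025, 1, 1)),
        ("10 Fin Model", "monthly", date(2025, 1, 1)),
        ("20 AI Addendum", "continuous", date(2024, 6, 1)),
    ]
    print(overdue_files(sample, today=date(2025, 1, 10)))
```

With the sample dates, only the weekly file is past its window on 10 January; the monthly file still has time left.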

Self-audit checklist
7 questions to ask every quarter.

[P0] Is every cell populated with a real number?
[P1] Is every decision MADE with rationale, not described as a choice?
[P2] Are files sequenced correctly for execution?
[P1] Do downstream files flag stale when an upstream number changes?
[P2] Does each file show a refresh cadence?
[P0] Has the founder run this appendix audit in the last 90 days?
[P2] Are the anti-goals being respected? (see 19-how-we-operate)

Cross-link consistency audit
5 rules, run on generation: 5 PASS, 0 FAIL.

# | Rule | Evidence | Result
1 | Pricing in 05 matches 10 ARPU and 06 script quotes. | 05 tiers: $12/mo, $120/yr, $5/hr metered. 10 §assumptions: "ARPU $12/mo." 06 templates quote $12/mo, $120/yr, $144/yr (=12×$12 monthly). All aligned. | PASS
2 | ICP in 01 matches 02 persona and 21 market description. | 01: "serious immersion-method language learners (JA/KO/ZH-first)." 02 persona: "Mei, 24 — immersion-method learner." 21: "the immersion-learner ICP." Same wedge, same wording. | PASS
3 | KPI list in 14 is the union of OKR KRs (12) + retention metric (08) + financial metrics (10). | 14 spans 7 funnels / 20 metrics including W4 retention & aha-rate (08), MRR, paying conversion, WTP-vs-COGS (12), and ARPU/CAC/LTV/burn (10). The union is covered. | PASS
4 | Risk register P0s in 16 include all P0s from REVIEW.md. | REVIEW.md 4 P0s → all in 16: COGS contradicted 5–13× (R4 WTP-vs-COGS), ICP/mode/beachhead collision (R1 wedge), zero user validation (R-validation "<5 paying / <10 LOIs"), founder-market fit absent (R5 audio/OS engineer uncommitted + FMF risk). | PASS
5 | 90-day plan (17) week-1 outcomes include the first experiment from 09. | 09 top card = E01 Mom-Test interviews (recruit, metric, MDE, kill-number). 17 W1 outcome = "Mom-Test interviews with immersion learners from r/LearnJapanese, Refold…"; same experiment, same source. | PASS
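
Rule 1 is mechanical enough to script. The following is a minimal sketch of how that check could be automated, not a description of how the audit was actually run: the dollar figures are transcribed from the rule 1 evidence, while the dictionary layout, key names and `check_rule_1` are hypothetical.

```python
# Figures transcribed from the rule 1 evidence; the structure is an assumption.
PRICING_05 = {"monthly": 12, "annual": 120, "metered_per_hour": 5}
ARPU_10 = 12
QUOTES_06 = {"monthly": 12, "annual": 120, "monthly_x12": 144}


def check_rule_1(pricing, arpu, quotes):
    """Return a list of mismatches; an empty list means PASS."""
    problems = []
    if arpu != pricing["monthly"]:
        problems.append(f"10 ARPU {arpu} != 05 monthly tier {pricing['monthly']}")
    for key in ("monthly", "annual"):
        if quotes[key] != pricing[key]:
            problems.append(f"06 {key} quote {quotes[key]} != 05 {key} tier {pricing[key]}")
    # The 'twelve months at the monthly rate' figure quoted in 06 must equal 12 x monthly.
    if quotes["monthly_x12"] != 12 * pricing["monthly"]:
        problems.append(
            f"06 monthly-x12 quote {quotes['monthly_x12']} != {12 * pricing['monthly']}"
        )
    return problems


if __name__ == "__main__":
    result = check_rule_1(PRICING_05, ARPU_10, QUOTES_06)
    print("PASS" if not result else "FAIL: " + "; ".join(result))
```

Returning the list of mismatches rather than a bare boolean keeps the PASS/FAIL evidence alongside the result, which matches the shape of the audit table.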

Decision log: seed
What was decided during generation. Append as you iterate.

Full RACI and the master log live in 15-decisions-raci.

D-01 Lead with the consumer immersion wedge, not broad "translate any app."

Broad pitch is a race to $0 against free Zoom/extension features; the immersion niche is structurally walled from browser-only rivals. (00, 03)

D-02 Hero mode = subtitle / dual-track, not auto-mute-and-dub.

The job is "understand live native speech while keeping the original audio for ear-training." Auto-dub is an optional toggle. (00, 02)

D-03 Price the middle tier at $12/mo ($120/yr); metered dub at $5/hr.

Anchored to existing learner spend ($60–180/yr on Migaku/Anki). The WTP-vs-COGS gap is the open risk, to be validated before the price is locked. (05, 10)

D-04 Treat the OS audio tap as a head-start, not the moat.

It scores 0/7 Helmer — uses public APIs anyone can call. The compounding asset is per-app quality telemetry. (00 bet 2, 16 R2)

D-05 North Star = ≥4 native-desktop sessions / active user / week.

Sessions-on-native-clients (not minutes, not signups) is the one number proving engagement AND value where browser tools can't follow. (00, 14)

D-06 PLG + community channel; defer paid CAC.

Seed Refold/TheMoeWay/Migaku Discords and r/LearnJapanese; time launches to immersion-challenge kickoffs. (00 bet 3, 06)

D-07 Hold the $10B B2B/SDK option; do not open with it.

Lead consumer to earn the moat (capture primitive + telemetry), then expand to accessibility-compliance / embedded translation. (00 callout)

D-08 First load-bearing hire = audio/OS-internals (Rust + CoreAudio/WASAPI) engineer.

The whole financial model and moat hinge on this one role; the hire must be named and signed before spend scales. (10, 11, 16 R5)

Glossary
Terms used inline across the playbook, defined in Babelio's context.

ICP
Ideal Customer Profile — here, the serious immersion-method JA/KO/ZH learner.
DMU
Decision-Making Unit — Champion / Economic Buyer / Blocker / End User by title.
North Star
The single metric proving the company is winning: ≥4 native-desktop sessions/user/week.
WTP
Willingness To Pay — the learner band ($6–12/mo) Babelio must validate vs the COGS floor.
COGS
Cost Of Goods Sold — per-user STT+MT+TTS+infra cost (~$3.26/paid user blended).
ARPU
Average Revenue Per User — $12/mo in the base model, matching the middle tier.
LTV : CAC
Lifetime Value to Customer Acquisition Cost ratio — target ~3:1 in the model.
NDR
Net Dollar Retention: a cohort's revenue a year later as a share of its starting revenue, counting expansion and subtracting churn and downgrades.
MDE
Minimum Detectable Effect — smallest lift an experiment is powered to detect.
ICE
Impact × Confidence × Ease — the 1-10 score ranking the experiment backlog (09).
RACI
Responsible / Accountable / Consulted / Informed — the decision-ownership matrix (15).
STT / MT / TTS
Speech-To-Text / Machine Translation / Text-To-Speech — the three latency legs of the dub pipeline.
Helmer (7 Powers)
Hamilton Helmer's 7 durable-advantage tests; the OS audio tap scores 0/7 (not a moat).
Mom Test
Rob Fitzpatrick's interview method — ask about past behavior, not hypotheticals (E01 in 09).
Van Westendorp
Price-sensitivity meter (4 questions) used to bracket WTP before locking tiers (05, 12).
LOI
Letter Of Intent: a non-binding commitment; the validation target is ≥5 paying or ≥10 LOIs.
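
The ARPU, COGS and LTV : CAC entries above combine into a simple unit-economics calculation. The sketch below is illustrative only. It assumes the ~$3.26 COGS figure is per paid user per month (the glossary does not state the period), uses the common simplification LTV = monthly contribution ÷ monthly churn, and takes a hypothetical 5% monthly churn that does not come from the playbook.

```python
ARPU = 12.00  # $/month, from the glossary
COGS = 3.26   # $ per paid user; ASSUMED to be per month


def gross_margin(arpu, cogs):
    """Share of revenue left after per-user cost of goods sold."""
    return (arpu - cogs) / arpu


def ltv(arpu, cogs, monthly_churn):
    """Simplified lifetime value: monthly contribution over monthly churn."""
    return (arpu - cogs) / monthly_churn


def max_cac(ltv_value, target_ratio=3.0):
    """Largest CAC that still meets the target LTV : CAC ratio."""
    return ltv_value / target_ratio


if __name__ == "__main__":
    hypothetical_churn = 0.05  # placeholder value, not a playbook number
    value = ltv(ARPU, COGS, hypothetical_churn)
    print(f"gross margin: {gross_margin(ARPU, COGS):.1%}")
    print(f"LTV at {hypothetical_churn:.0%} churn: ${value:.2f}")
    print(f"max CAC for 3:1: ${max_cac(value):.2f}")
```

Under the monthly-COGS assumption the gross margin works out to roughly 72.8%; the LTV and CAC ceiling move directly with whatever churn figure the retention file (08) actually supports.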