Babelio · Operating Playbook 08 · Retention
refresh: weekly
Artifact 08 — Demand Engine

Retention Model

The North Star is habit: ≥4 native-desktop sessions per user per week. This page sets the target cohort curve, names the aha moment that predicts it, instruments the onboarding path to that moment, and queues the five experiments that bend the curve. All targets are [HYP] — no cohort is measured yet; the thin-product cohort (growth.md §7 Week-3) produces the first real reading.

North Star: ≥4 sessions/wk
Target W4 retention: 28%
Target D30: 12% (vs 7% median)
Monthly churn: 6–8%
Bar for good. Anyone on the team can open this week's cohort dashboard, read it against the curve below, and know in 30 seconds whether we are on track, without asking the founder what "on track" means. The aha threshold and every drop-off event name are decided, instrumented, and unambiguous.
Purpose. A target cohort curve plus an aha moment with a measurable threshold — so retention stops being a vibe and becomes a number the whole team reads the same way.
Status: PMF pending, curve unmeasured. Traction is prototype-only: zero users, zero retained cohorts. We do not claim PMF until the day-14 Sean Ellis survey reads ≥40% "very disappointed" across ≥30 hands (growth.md §9). Every percentage on this page is a target to validate, not a result. The Week-3 thin-product cohort gives the first curve and the first PMF reading.
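
The PMF gate above has two bars that must both hold: sample size and share. A minimal sketch of that check, assuming survey results arrive as two plain counts (the function and parameter names are hypothetical, not an existing Babelio API):

```python
def pmf_reached(very_disappointed: int, respondents: int,
                min_share: float = 0.40, min_respondents: int = 30) -> bool:
    """Return True only when both the sample-size bar and the share bar are met."""
    if respondents < min_respondents:
        return False  # fewer than 30 hands: no PMF reading at all
    return very_disappointed / respondents >= min_share


if __name__ == "__main__":
    print(pmf_reached(very_disappointed=14, respondents=32))
```

Checking the sample size first keeps a strong share from a handful of respondents from being read as PMF.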

1 · Target cohort curve

Category benchmark: consumer-app retention medians are D1 26% / D30 ~7% (UXCam). A healthy retention curve must flatten — a flat asymptote above zero is the lightning-in-a-bottle signal (a16z). We target above-median because the immersion learner is a high-intent daily-habit user, not a casual installer.

Cohort point | W1 | W4 | W12 | M6
Babelio target (% retained) | 55% | 28% | 18% | 14%
Consumer-app median | ~40% | ~14% | ~9% | ~7%
Top-quartile habit app | ~65% | ~35% | ~25% | ~20%
[Figure: Target curve vs consumer median (must flatten, not bleed to zero). Series: Babelio target, Consumer-app median. X-axis: W1, W4, W12, M6.]
Why above-median is defensible. The behaviour the curve measures is daily immersion: these users already watch native foreign content 7–21 times/week (audience.md). Retention is not "do they remember the app" but "did the app stay inside a habit they already have." The flat M6 tail at 14% (double the consumer median) is the bet: if Babelio embeds into the existing immersion routine, the curve flattens; if it's a novelty, it bleeds to the 7% median. The shape of the tail is the PMF signal.
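
To make the 30-second read concrete, the sketch below compares a measured cohort against the target and median rows from this section, then computes step-to-step survival ratios as a rough flattening check. The targets are the [HYP] values from this page; the function names and the "rising ratios mean flattening" heuristic are illustrative assumptions, not an agreed dashboard rule.

```python
# Values copied from the cohort table in this section; all are [HYP] targets.
TARGET = {"W1": 0.55, "W4": 0.28, "W12": 0.18, "M6": 0.14}
MEDIAN = {"W1": 0.40, "W4": 0.14, "W12": 0.09, "M6": 0.07}
ORDER = ["W1", "W4", "W12", "M6"]


def read_cohort(measured: dict) -> dict:
    """Label each measured cohort point against the target and the median."""
    status = {}
    for point in ORDER:
        if point not in measured:
            continue  # cohort has not aged to this point yet
        value = measured[point]
        if value >= TARGET[point]:
            status[point] = "on target"
        elif value > MEDIAN[point]:
            status[point] = "above median, below target"
        else:
            status[point] = "at or below median"
    return status


def survival_ratios(curve: dict) -> dict:
    """Share of the previous point's retained users still retained at the next point."""
    points = [p for p in ORDER if p in curve]
    ratios = {}
    for earlier, later in zip(points, points[1:]):
        if curve[earlier] > 0:  # skip undefined ratios
            ratios[f"{earlier}->{later}"] = curve[later] / curve[earlier]
    return ratios


if __name__ == "__main__":
    print(read_cohort({"W1": 0.50, "W4": 0.30}))
    print(survival_ratios(TARGET))
```

On the target curve the ratios rise from one step to the next, which is one way to express "flattens" as a number; a curve bleeding to zero would show ratios that stay low or fall.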

2 · The aha moment

One event, two halves, a hard threshold. The aha is not "installed" and not "ran one session" — it is the moment the user both experiences the magic and captures a keepable artifact. Both halves must fire inside the first 7 days.

Aha moment — defined & instrumented

First dual-track session + first exported Anki card, within 7 days of install.

Threshold, both must fire: ≥1 + ≥1 (≥1 dual-track session >5 min AND ≥1 exported/shared sentence-mining card, ≤ day 7)
Activation target by D7: 45% of installs reach the aha event by day 7 (growth.md §9)

The dual-track session is the magic — the user hears the original voice and a live translation under it, inside a native app a browser extension can't touch. The exported Anki card is the keepable artifact — it converts a fleeting "wow" into a study asset the user owns, and it is the same action that seeds the §3 growth loop. Coupling activation to the artifact means activation and loop-supply are one event.
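
The two-half threshold can be written as a single predicate over a user's event stream. This is a sketch under stated assumptions: the event names `dualtrack_session_started` and `artifact_created` come from the onboarding table on this page, while the `Event` record, its `duration_s` field, and the use of an install timestamp as the window start are hypothetical choices for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Event:
    name: str
    ts: datetime
    duration_s: float = 0.0  # hypothetical field; only meaningful for sessions


def reached_aha(install_ts: datetime, events: list,
                window: timedelta = timedelta(days=7),
                min_session_s: float = 5 * 60) -> bool:
    """True when both aha halves fire within the window after install."""
    deadline = install_ts + window
    in_window = [e for e in events if install_ts <= e.ts <= deadline]
    # Half 1: at least one dual-track session longer than 5 minutes.
    had_session = any(
        e.name == "dualtrack_session_started" and e.duration_s > min_session_s
        for e in in_window
    )
    # Half 2: at least one exported or shared card.
    had_export = any(e.name == "artifact_created" for e in in_window)
    return had_session and had_export


if __name__ == "__main__":
    t0 = datetime(2025, 1, 1)
    sample = [
        Event("dualtrack_session_started", t0 + timedelta(days=1), 400),
        Event("artifact_created", t0 + timedelta(days=6)),
    ]
    print(reached_aha(t0, sample))
```

D7 activation is then the share of installs for which this predicate holds, read against the 45% target.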

Hypothesized cohort evidence (to confirm Week-3): users who export their first card within 7 days are expected to retain at ~4× the W4 rate of users who only run a session and never export — the export is the commitment that turns a try into a habit. This correlation is the single metric the Week-3 cohort must validate before we anchor onboarding on it. If it doesn't hold, the aha is re-defined to whatever the data says predicts W4 retention.
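
The ~4× claim is a ratio of two W4 retention rates, so the Week-3 validation reduces to a small calculation. A minimal sketch, assuming each cohort member is a pair of booleans (exported a card by day 7, retained at W4) and that the input is already filtered to users who ran at least one dual-track session; the function name is hypothetical.

```python
def w4_retention_lift(users):
    """Ratio of W4 retention for exporters vs session-only users, or None if undefined."""
    exp_total = exp_kept = non_total = non_kept = 0
    for exported_by_d7, retained_w4 in users:
        if exported_by_d7:
            exp_total += 1
            exp_kept += int(retained_w4)
        else:
            non_total += 1
            non_kept += int(retained_w4)
    # The ratio is undefined with an empty group or zero baseline retention.
    if exp_total == 0 or non_total == 0 or non_kept == 0:
        return None
    return (exp_kept / exp_total) / (non_kept / non_total)


if __name__ == "__main__":
    cohort = ([(True, True)] * 2 + [(True, False)] * 2
              + [(False, True)] + [(False, False)] * 7)
    print(w4_retention_lift(cohort))
```

With a cohort of only a few dozen users the ratio will be noisy, so it is worth reading alongside the raw counts in each group.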

3 · Onboarding flow

Seven steps from download to aha. The riskiest step is install-trust (Gatekeeper / antivirus, growth.md §4a) — every step is instrumented so we can see exactly where the funnel leaks. The whole flow is designed to reach dual-track session + first export as fast as possible.

# | Step | Target time | Drop-off risk | Event name
1 | Download & first launch. Signed + notarized installer; verified-publisher path, no red warning. | <90s | Gatekeeper / SmartScreen wall scares the user off. | app_first_launch
2 | Permission & trust screen. "Why these permissions" + "what we DON'T do" copy shown at the OS prompt (product.md §5.5). | <60s | Audio-capture permission denied; "is this a keylogger?" | audio_permission_granted
3 | Latency proof (canned clip test). Run a sample clip end-to-end; show measured ms so trust is earned by demonstration. | <30s | User doubts the <700ms claim before risking real content. | latency_demo_passed
4 | Auto-detect & capture-picker. VAD flags the running foreign-audio app; one-click "subtitle this now", target language remembered. | <20s | No foreign-audio app open → empty first-run, nothing to try. | capture_source_selected
5 | First dual-track session ★. Captions + whisper-dub under preserved original; the magic moment. Aha half #1. | <5 min in | Latency feels like an echo / captions lag → trust breaks. | dualtrack_session_started
6 | First exported Anki card ★. Capture a line → gloss → export to Anki / share as recap. Aha half #2 + loop seed. | ≤ day 7 | User never discovers export; magic stays fleeting, no keepable asset. | artifact_created
7 | Save presets & return habit. Saved language/app presets + "your mined cards" history → zero-friction return next session. | D1–D7 | No reason to reopen → one-and-done; never reaches ≥4 sessions/wk. | preset_saved · session_returned
Kill criterion on the flow. If <40% reach activation (steps 5+6) even when hand-onboarded via Loom, the product is not yet retainable: fix the funnel before any paid spend (growth.md §7 Week-3). Watch install-completion rate (steps 1→2): <50% means friction is the bottleneck; fix it before scaling.
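
The two funnel gates above can be checked from three counts. A minimal sketch, assuming `launched` counts users with `app_first_launch`, `permission_granted` counts users with `audio_permission_granted`, and `activated` counts users who fired both aha halves; treating launches as the denominator for both rates is an assumption, and the function name is hypothetical.

```python
def funnel_flags(launched: int, permission_granted: int, activated: int) -> list:
    """Return the kill-criterion flags raised by this cohort's funnel counts."""
    if launched == 0:
        return ["no installs yet"]
    flags = []
    # Steps 1 -> 2: install-completion below 50% means friction is the bottleneck.
    if permission_granted / launched < 0.50:
        flags.append("install friction: fix before scaling")
    # Steps 5 + 6: activation below 40% means the product is not yet retainable.
    if activated / launched < 0.40:
        flags.append("activation below kill line: fix funnel before paid spend")
    return flags


if __name__ == "__main__":
    print(funnel_flags(launched=100, permission_granted=45, activated=30))
```

An empty list means neither gate is tripped; it does not mean the 45% activation target has been reached.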

4 · Top 3 churn signals

Each churn driver gets a leading indicator we can fire on before the user is gone, and a decided mitigation. Monthly churn target band is 6–8%; these three signals are expected to explain most of it [HYP].

Churn signal | Leading indicator (fire early) | Mitigation (decided)
"Makes me lazy": the intermediate-plateau learner who feels the tool is doing the work and quits to "earn it" | dub-min share rising while subtitle-min & card exports fall | Keep subtitle/dual-track mode generous and never throttled; reframe with a weekly digest ("you understood X% on your own this week") so the tool reads as a coach, not a crutch.
Metered-overage sticker shock: a rare dub-heavy user hits the meter and rage-quits | dub-min > 80% of allotment before day 20 of cycle | In-app meter that suggests the right tier before the overage hits; soft-throttle dub to caption-only ("dub paused: heavy usage") rather than billing a surprise. No silent overage charge.
Quality / latency break: captions lag or a hallucinated line destroys trust mid-session | auto-degrade ("fast mode") trigger-rate spike or a session abandoned <5 min after start | Eval-gated deploys hold p95 <700ms and insertion <2%; degrade visibly to caption-only rather than letting dub echo. Flag the abandoned-session line into Session Review → eval set, so the failure improves the model.
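
The leading indicators in the table above are simple enough to express as rules. The sketch below is one possible reading: the field names (`dub_min`, `subtitle_min`, `exports`), the 1-indexed `day_of_cycle`, and the definition of dub share as dub minutes over dub plus subtitle minutes are all assumptions made for illustration.

```python
def lazy_drift(prev: dict, curr: dict) -> bool:
    """'Makes me lazy': dub share rising while subtitle minutes and exports fall."""
    def dub_share(week: dict) -> float:
        total = week["dub_min"] + week["subtitle_min"]
        return week["dub_min"] / total if total else 0.0

    return (dub_share(curr) > dub_share(prev)
            and curr["subtitle_min"] < prev["subtitle_min"]
            and curr["exports"] < prev["exports"])


def overage_risk(dub_min_used: float, dub_min_allotment: float, day_of_cycle: int) -> bool:
    """Sticker shock: more than 80% of the dub allotment used before day 20."""
    return day_of_cycle < 20 and dub_min_used > 0.80 * dub_min_allotment


def abandoned_early(session_seconds: float) -> bool:
    """Quality break: session abandoned less than 5 minutes after start."""
    return session_seconds < 5 * 60


if __name__ == "__main__":
    last_week = {"dub_min": 30, "subtitle_min": 90, "exports": 5}
    this_week = {"dub_min": 50, "subtitle_min": 40, "exports": 1}
    print(lazy_drift(last_week, this_week), overage_risk(85, 100, 12), abandoned_early(140))
```

Each rule is meant to fire a mitigation early, not to label a user as churned.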

5 · Five retention experiments queued

Each links to its row in 09 · Experiment Backlog (ICE-scored, fully designed there). These are the five that move the curve above — ordered by what unblocks activation first.

Retention experiment queue → see 09 for full ICE design
R-01
Force the export in onboarding

Add a guided "export your first card" step right after the first dual-track session, before the user leaves. Hypothesis: lifts D7 activation (both aha halves) from baseline toward 45%.

metric: D7 activation %
backlog #09 → exp 03
R-02
"Understood X% on your own" weekly digest

Email/in-app digest reframing the tool as a coach (cards mined, hours of native content, self-comprehension %). Hypothesis: counters the "makes me lazy" churn driver; lifts W4→W12 retention.

metric: W12 retention
backlog #09 → exp 07
R-03
Pre-overage tier nudge

In-app meter that suggests the right tier (or soft-throttles dub to caption-only) before the dub allotment runs out. Hypothesis: removes sticker-shock churn from heavy-dub users; protects the 6–8% band.

metric: monthly churn %
backlog #09 → exp 11
R-04
Immersion-streak digest at D30

"You mined 42 sentences / 6 hrs native content this month" + new-app/language coverage nudge, leaning into the learner identity that drives community referral. Hypothesis: flattens the M6 tail toward 14%.

metric: M6 retention
backlog #09 → exp 14
R-05
Visible-degrade vs silent-echo A/B

When latency spikes, A/B a visible "fast mode" caption-only drop against the current behaviour, measuring mid-session abandonment. Hypothesis: visible degrade beats silent echo on session-completion and W1 retention.

metric: session-completion %
backlog #09 → exp 18

Retention don'ts

Don't measure retention as MAU

MAU is vanity. The North Star is ≥4 native-desktop sessions per user per week, left running over real content: "did they trust it enough to leave it running over a real lecture?" Cohort the session metric, not the login.

Don't claim PMF off targets

Every curve here is [HYP]. PMF is claimed only at ≥40% "very disappointed" across ≥30 hands on the day-14 Sean Ellis survey — not before, not from a target.

Don't throttle the subtitle mode

Subtitle/dual-track is the cheap-COGS habit core (~$0.31/active-hr). Throttling it punishes the exact daily behaviour the curve depends on. Meter dub, never subtitles.
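
For the first don't above, the North Star is a per-user, per-week session count, not a login count. A minimal sketch of that measurement, assuming sessions are available as `(user_id, date)` pairs and that "week" means ISO calendar week; both choices and the function name are assumptions.

```python
from collections import Counter
from datetime import date


def habit_users(sessions, week, min_sessions: int = 4) -> set:
    """Users with at least `min_sessions` sessions in the given (iso_year, iso_week)."""
    counts = Counter(
        user_id
        for user_id, day in sessions
        if tuple(day.isocalendar())[:2] == week
    )
    return {user_id for user_id, n in counts.items() if n >= min_sessions}


if __name__ == "__main__":
    log = [
        ("a", date(2024, 1, 1)), ("a", date(2024, 1, 2)),
        ("a", date(2024, 1, 3)), ("a", date(2024, 1, 4)),
        ("b", date(2024, 1, 1)), ("b", date(2024, 1, 2)),
        ("b", date(2024, 1, 8)), ("b", date(2024, 1, 9)),
    ]
    print(habit_users(log, (2024, 1)))
```

Here user "b" has four sessions in total but only two in the week being read, so only "a" counts toward the habit metric; an MAU count would treat both the same.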

Linked artifacts
  • 09 · Experiment Backlog: The five queued experiments (R-01…R-05) map to ICE-scored, fully designed rows there.
  • 07 · Growth Loop: The aha artifact (first exported card) is the same event that supplies the §3 sentence-mining loop; activation and loop-supply are one action.
  • 14 · KPI Dashboard: North Star, W4/D30 retention, activation %, and monthly churn from this page feed the KPI dashboard's retention block.
Babelio · Operating Playbook · 08 — Retention Model Refresh weekly · source of truth: research/growth.md §9 + research/product.md §1