Babelio Operating Playbook · 04
quarterly
Market sizing — bottom-up

$13M wedge we can defend — not a $720M headline we can't.

Every number below is built bottom-up from named, sized communities and one ARPU ($12/mo = $144/yr). We lead with the tight wedge SAM and mark the broad-consumer figure as a deliberate ceiling, not a forecast.

Bar for good: Survives a partner-meeting interrogation. Every cell is sourced, every ratio is defended, and the SOM-as-%-of-SAM is explained, not hidden.

TAM: $72M (learner)
SAM · wedge: $13M
SOM Y3: $1.9M ARR
ARPU: $144/yr

The build
Bottom-up, not a slice of a headline.

Babelio is a consumer subscription, so the unit is learners × annual price. We size from the watering holes the ICP actually lives in, not "all language learners." TAM is the serious immersion-learner population; SAM is the desktop-capable, English-first, already-paying core; SOM is summed bottom-up by channel — never as a percentage of SAM.

| Tier | Derivation | Value |
|---|---|---|
| TAM | 500,000 serious immersion learners × $144/yr | $72M |
| SAM | desktop-capable · English-first · paying core = ~18% of TAM = 90,000 × $144 | $13M |
| SOM (Y3) | Σ (channel trials × paid conv. × $144) + annual hedge; see build below | $1.9M |
| CEILING | broad-consumer headroom: 5,000,000 heavy users × $144. NOT modeled against. | $720M |
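The tiers above are all the same multiplication, learners × annual price. A minimal sketch of that arithmetic, using only the figures stated in the table (the function and variable names are illustrative, not part of the playbook):

```python
# Bottom-up sizing: every tier is (number of learners) x (annual ARPU).
ARPU_PER_YEAR = 12 * 12  # $12/mo Pro tier -> $144/yr


def market_value(learners: int, arpu: int = ARPU_PER_YEAR) -> int:
    """Annual revenue pool in dollars for a given learner count."""
    return learners * arpu


TAM_LEARNERS = 500_000      # serious immersion learners
SAM_SHARE_PCT = 18          # desktop-capable, English-first, paying core
CEILING_USERS = 5_000_000   # broad-consumer headroom, not modeled against

sam_learners = TAM_LEARNERS * SAM_SHARE_PCT // 100

tam = market_value(TAM_LEARNERS)
sam = market_value(sam_learners)
ceiling = market_value(CEILING_USERS)

for label, value in [("TAM", tam), ("SAM", sam), ("CEILING", ceiling)]:
    print(f"{label:8s} ${value / 1e6:,.2f}M")
```

The SAM works out to $12.96M, which the playbook rounds to $13M.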
Why the $720M is a ceiling, not a TAM

"Anyone watching foreign video" is the contested zone where Zoom's free Voice Translator and free browser extensions (DubTab, Whisperr, Language Reactor) already cap pricing, and audience research classes that crowd as Anti-ICP: no trigger, no community, and zero willingness to pay when free captions are available. We show the $720M only to prove that headroom exists as mobile capture, casual-learner adjacency, and a non-English UI widen the SAM toward it. We do not forecast against it.

Sourced variables
Every cell has a source and a confidence.

The two soft cells — the 500k serious-learner count and the 18% paying-core cut — are derived (community sizes minus heavy overlap), so we mark them medium confidence and refresh them with the first cohort's real conversion data.

| Variable | Value | Source & derivation | Conf. |
|---|---|---|---|
| Community pool (raw) | ~4.8M | r/languagelearning ~3M + r/LearnJapanese ~700k + r/Korean ~700k + r/ChineseLanguage ~400k, heavy overlap. Source: reddit.com/r/LearnJapanese · Refold / TheMoeWay / Migaku Discords | MED |
| Serious immersion learners (TAM count) | 500,000 | "Serious, paying, desktop-immersion" core = conservative fraction of the raw pool after dedup, across JA/KO/ZH/ES. Source: research/market.md §3 · research/audience.md §1 (derived) | MED |
| ARPU | $144/yr | $12/mo Pro tier; fits validated $6–12 WTP band. One ARPU across all docs. Source: research/monetization.md §3 | HI |
| SAM cut (reachable %) | ~18% | English-first ∩ macOS 14+/Win11-capable ∩ already-paying for tools. Intentionally tight; widens with mobile + non-English UI. Source: research/market.md §3 (derived) | MED |
| Existing tool spend (WTP anchor) | $60–180/yr | Migaku ~$8/mo, Language Reactor ~$5–9/mo, Anki add-ons; proves a paying habit at our price. Source: migaku.com · languagereactor.com | HI |
| Free→paid conversion | 6% | Conservative end of the 2026 reverse-trial band (4–6% good, 8–12% great). Used for the SOM build. Source: growthunhinged.com · adapty.io | MED |
| S2S translation segment (top-down) | $481.6M | 2025, ~9.5% CAGR to 2035; used only as the sanity ceiling, discounted 5–10×. Source: expertmarketresearch.com | LO |
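Two of the cells above can be cross-checked against each other: the raw community pool is a plain sum, and the 500,000 TAM count implies a particular post-dedup fraction of that pool. A small sketch under the table's own figures (the ~10% fraction is computed here, not quoted from the research files, and the names are illustrative):

```python
# Raw community pool: subscriber counts as listed, overlap NOT yet removed.
communities = {
    "r/languagelearning": 3_000_000,
    "r/LearnJapanese": 700_000,
    "r/Korean": 700_000,
    "r/ChineseLanguage": 400_000,
}
raw_pool = sum(communities.values())

# The TAM count is a derived cell; this is the fraction of the raw pool it implies.
serious_learners = 500_000
implied_fraction = serious_learners / raw_pool

# ARPU should sit inside the existing-tool-spend band used as the WTP anchor.
arpu = 144
wtp_low, wtp_high = 60, 180
arpu_in_band = wtp_low <= arpu <= wtp_high

print(f"raw pool: {raw_pool:,}")
print(f"implied serious-learner fraction: {implied_fraction:.1%}")
print(f"ARPU inside WTP band: {arpu_in_band}")
```

Keeping the implied fraction visible makes it easy to re-derive the TAM count when the subscriber counts are refreshed.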

3-year SOM build · by channel
Summed from channels, never a % of SAM.

PLG through immersion communities is the engine; the annual / creator-clipper hedge is the secondary segment. Outbound and paid ads stay near-zero — this is a consumer, community-led motion, not a sales-led one, so we do not budget meaningful outbound CAC. Y3 cumulative ARR ≈ $1.9M.

| Channel | Y1 ARR | Y2 ARR | Y3 ARR | Driver |
|---|---|---|---|---|
| PLG · immersion communities | $150K | $650K | $1,300K | →1,400 trials/mo × 6% paid × $144; Refold/MoeWay/Migaku, r/LearnJapanese, polyglot-YouTuber sponsorships, challenge-kickoff timing. |
| Annual + creator/clipper hedge | $60K | $320K | $600K | →~3,000 annual seats × ~$120 effective + metered-dub expansion; secondary segment (audience.md §1). |
| Content / SEO organic | $10K | $60K | $120K | "How to immerse in [lang]" guides, app-coverage pages; compounds slowly. |
| Outbound / paid ads | $0 | ~$0 | ~$0 | Deliberately near-zero: community-led consumer motion; paid ads lose to free captions at $12/mo. |
| Total cumulative ARR | $220K | $1.03M | $1.9M | Y1 ≈ base 12-mo projection ($185K ARR EoY); Y3 SOM ≈ 14% of $13M wedge SAM. |
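The channel rows can be cross-checked by summing each year's column, and the PLG driver can be worked through for a single year at a constant 1,400 trials per month. The sketch below does both. Treating "~$0" as 0 and ignoring churn and ramp-up are simplifying assumptions made here, not part of the model. With the figures as listed, the columns sum to $220K, $1,030K, and $2,020K, so the Y3 row sum lands about $120K above the $1.9M headline; the sketch reports it as-is rather than reconciling it.

```python
# Channel ARR by year in $K, as listed in the table ("~$0" treated as 0).
channels = {
    "PLG / immersion communities": (150, 650, 1300),
    "Annual + creator/clipper hedge": (60, 320, 600),
    "Content / SEO organic": (10, 60, 120),
    "Outbound / paid ads": (0, 0, 0),
}

# Sum each year's column across channels.
totals = [sum(year) for year in zip(*channels.values())]

# PLG driver for one year at a flat rate (no churn, no ramp: an assumption).
trials_per_month = 1_400
paid_conversion_pct = 6
arpu = 144
paid_users_per_year = trials_per_month * 12 * paid_conversion_pct // 100
plg_arr_one_year = paid_users_per_year * arpu

print("column totals ($K):", totals)
print(f"PLG, one year at flat rate: ${plg_arr_one_year:,}")
```

The one-year PLG figure of roughly $145K is close to the $150K Y1 cell, which suggests the stated driver describes the first-year run rate.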
Why SOM is 14% of SAM (above the <5% rule of thumb)

It is an artifact of the deliberately narrow SAM cut, not over-aggressive capture. Against the serious immersion-learner population (the ~$72M TAM), the same Y3 SOM is ~2.7%; against the $720M ceiling it is ~0.25%. We chose to lead with the tight wedge for honesty, which inflates the %-of-SAM optics; the absolute capture is conservative.
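The same Y3 SOM can be expressed against each denominator in a few lines. This sketch uses the rounded headline figures ($1.9M, $13M, $72M, $720M), so its results can differ in the last digit from percentages computed on unrounded model values:

```python
# Y3 SOM as a share of each market tier, all values in $M (rounded headlines).
som_y3 = 1.9
bases = {
    "wedge SAM": 13.0,
    "learner TAM": 72.0,
    "broad-consumer ceiling": 720.0,
}

# Percentage of each base captured by the Y3 SOM.
shares = {name: som_y3 / base * 100 for name, base in bases.items()}

for name, pct in shares.items():
    print(f"SOM as % of {name}: {pct:.2f}%")
```

On these inputs the shares come out at about 14.6%, 2.6%, and 0.26%, which shows how strongly the choice of denominator drives the optics.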

Source tiers
Primary, corroborating, citation-only.

Tier 1 — primary / counts (load-bearing)
  • r/LearnJapanese, r/Korean, r/languagelearning, r/ChineseLanguage — community subscriber counts (the TAM denominator).
  • Refold · TheMoeWay · Migaku Discords — active immersion-learner population (channel reach).
  • Claryti remote-meeting statistics 2026: adjacency sizing for the ceiling.
Tier 2 — corroborating (pricing / WTP / conversion)
  • Migaku (~$8/mo) · Language Reactor (~$5–9/mo) — existing tool spend = WTP anchor.
  • Growth Unhinged · adapty.io — 2026 reverse-trial free→paid benchmarks (6% used).
  • DubTab · Slator (Zoom Voice Translator): free-substitute price floor capping the ceiling.
Tier 3 — citation-only (headline numbers, discounted 5–10×)

Top-down sanity check
Does the bottom-up fit inside the top-down?

Verdict: match acceptable

Top-down headline: the real-time S2S translation segment is $481.6M (2025), inside a $3.68B AI-translation market (2026). Bottom-up: our learner TAM is $72M, roughly 15% of the S2S segment (bottom-up TAM ÷ top-down S2S ≈ 0.15), which is plausible for the desktop-app immersion-learner slice of a meeting-dominated category. Wedge SAM $13M and Y3 SOM $1.9M sit far below any top-down ceiling, so no mismatch is flagged. Any TAM above the low-hundreds-of-$M for real-time desktop translation would be geography/segment inflation, which is exactly why the $720M figure is fenced off as a ceiling, not a base case.
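The sanity check reduces to one ratio plus the 5–10× discount applied to the top-down headline. A minimal sketch, assuming the discount means dividing the $481.6M segment by 5 and by 10 (that reading of "discounted 5–10×" is an interpretation, and the names are illustrative):

```python
# Top-down vs bottom-up, all values in $M.
S2S_SEGMENT = 481.6   # real-time S2S translation segment, 2025
LEARNER_TAM = 72.0    # bottom-up learner TAM

# Bottom-up TAM as a fraction of the top-down segment.
ratio = LEARNER_TAM / S2S_SEGMENT

# Headline discounted 5-10x gives a band the bottom-up TAM should fit inside.
discounted_low = S2S_SEGMENT / 10
discounted_high = S2S_SEGMENT / 5
fits_band = discounted_low <= LEARNER_TAM <= discounted_high

print(f"bottom-up / top-down ratio: {ratio:.3f}")
print(f"discounted band: ${discounted_low:.1f}M to ${discounted_high:.1f}M")
print(f"learner TAM inside band: {fits_band}")
```

Under that reading, the $72M learner TAM sits inside the discounted band of roughly $48M to $96M, consistent with the verdict.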
See also 00 strategy memo for the wedge thesis · 01 ICP brief for the 500k-learner segmentation · 05 pricing for the $12 ARPU & COGS basis · 10 financial model for the 36-month projection these SOM numbers feed.
Babelio · TAM / SAM / SOM · Operating Playbook 04 Region: Global · Refresh: quarterly