Pre-seed · Real-time AI translation layer

Live subtitles for any desktop app.

Babelio captures audio from any native desktop app — and overlays a real-time translation in your language, under the original voice. The reach a browser extension can't touch.

01 · Problem

Translation lives in silos.

A serious language learner watches native foreign content on desktop for hours a day. The live stream has no fan-sub. The lecture player is a native app, not a browser tab. So they pause every sentence, look up words, lose the flow — and give up.

Pain frequency: 7–21/wk
Already spend on tools: $60–180/yr
Comprehension on native content: ~60%

02 · Solution

One OS-level layer. Any app.

A desktop app for Mac and Windows that captures audio from any running process and overlays a real-time translation — in under 700ms, the threshold where it reads as interpretation, not a delayed echo.

Pick the app, pick the language once, toggle on. It feels like one button — that's the whole promise.

03 · Why Now

Two curves crossed in 2025.

One enables the experience. One opens a lane nobody else can take. We're honest about which is which.

The honest read

The monetizable edge rests on inflection #2 alone. "Translate any app for everyone" rides a trend everyone rides. The defensible play is native-desktop capture — and what we build on top of it.

04 · Market

A $13M wedge inside a $72M TAM.

Bottom-up, not market-report inflation. We lead with the wedge we can actually win, not the ceiling.

TAM: $72M (500k serious immersion learners × $144/yr)
SAM: $13M (~90k desktop, English-first, already-paying; the headline wedge)
SOM Y3: $1.8M ARR (bottom-up by channel, not % of SAM)

The speech-to-speech (S2S) translation segment is ~$481.6M (2025); AI-in-translation hits $3.68B in 2026 at a 25.2% CAGR. A broad-consumer ceiling of 5M heavy English-first users × $144 would be ~$720M, but that zone is already capped by Zoom's free feature and free browser extensions. We treat $720M as headroom, never as the base case.

Sources: Expert Market Research (S2S $481.6M), The Business Research Company (AI translation $3.68B / 25.2% CAGR).
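
The bottom-up figures above reduce to users × annual price. A minimal sketch of that arithmetic, assuming the $144/yr figure is simply the $12/mo Pro price times 12 (the function and variable names are ours, for illustration only):

```python
# Bottom-up market sizing from the figures quoted in this section.
# Assumption: $144/yr is the $12/mo Pro price x 12 months.
ARPU_PER_YEAR = 12 * 12  # dollars per paying user per year


def market_size(users: int, arpu: int = ARPU_PER_YEAR) -> int:
    """Annual revenue pool in dollars for a given number of paying users."""
    return users * arpu


tam = market_size(500_000)        # serious immersion learners
sam = market_size(90_000)         # desktop, English-first, already paying
ceiling = market_size(5_000_000)  # broad-consumer headroom, not the base case

for label, value in [("TAM", tam), ("SAM", sam), ("Ceiling", ceiling)]:
    print(f"{label}: ${value / 1e6:.1f}M")
```

The SAM comes out at $12.96M, which the deck rounds to $13M.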

05 · Product

Pick the app. Toggle on.

The whole flow is one screen and one switch. Everything else runs in the background.

North Star: weekly native-desktop sessions translated end-to-end ≥ 4 per active user. Did they trust it enough to leave it running over a real lecture?

06 · Business Model

$12/mo, metered for margin.

Hybrid: flat $12/mo Pro base + metered translation-minutes via a reverse trial. Cheap subtitle minutes are the monetized core; dub minutes are metered to protect margin.

COGS (subtitle / dub): $0.31 / $0.50 per hr
Gross margin (Pro): 73%
LTV:CAC: ~3:1

The #1 thing we're validating

The willingness-to-pay (WTP) ceiling ($6–12/mo, capped by free extensions) sits near the COGS floor for heavy dub usage. We resolve this by mode: subtitle is the cheap hero mode learners actually want. A Van Westendorp survey plus a metered $5/hr concierge test on 30–50 buyers has to prove this before we lock tiers.
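
To see how close the price sits to the COGS floor, here is a rough sketch of margin as a function of monthly usage. It assumes a flat $12/mo with no metered revenue and treats the per-hour COGS figures above as the only variable cost; both are simplifications of ours, not the final pricing model.

```python
# Gross margin on the flat Pro fee as usage grows.
# Assumptions (ours): no metered revenue, COGS per hour is the only cost.
PRICE_PER_MONTH = 12.00
COGS_PER_HOUR = {"subtitle": 0.31, "dub": 0.50}


def gross_margin(hours_per_month: float, mode: str) -> float:
    """Fraction of the flat fee left after inference cost for the given usage."""
    cogs = COGS_PER_HOUR[mode] * hours_per_month
    return (PRICE_PER_MONTH - cogs) / PRICE_PER_MONTH


def hours_at_margin(target_margin: float, mode: str) -> float:
    """Monthly hours a user can consume before margin drops below the target."""
    return PRICE_PER_MONTH * (1 - target_margin) / COGS_PER_HOUR[mode]


for mode in COGS_PER_HOUR:
    print(
        f"{mode}: {hours_at_margin(0.73, mode):.1f} hr/mo at 73% margin, "
        f"{hours_at_margin(0.0, mode):.1f} hr/mo before the flat fee is gone"
    )
```

Under these assumptions, a 73% margin corresponds to roughly 10.5 subtitle hours or 6.5 dub hours per month, and the flat fee is fully consumed at about 38.7 subtitle hours or 24 dub hours. That gap is why dub minutes are the ones being metered.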

07 · Traction

Prototype. Zero users — and we'll say so.

A working prototype exists. No users, no revenue, no LOIs yet. Everything past that is a validation plan, not a claim. We'd rather be honest than inflate.

PMF status: PENDING. We claim PMF only at ≥40% "very disappointed" (Sean Ellis) across ≥30 hands. Not before.
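
The PMF gate above can be written as a simple check. This is a sketch, assuming "hands" means survey respondents; the function name and parameters are hypothetical.

```python
# Sean Ellis style PMF gate: at least 40% "very disappointed",
# and only counted once at least 30 people have answered.
# Assumption (ours): "hands" in the text means survey respondents.
def pmf_reached(
    very_disappointed: int,
    respondents: int,
    min_share: float = 0.40,
    min_respondents: int = 30,
) -> bool:
    """Return True only when both the sample size and the share clear the bar."""
    if respondents < min_respondents:
        return False  # too few answers to claim anything
    return very_disappointed / respondents >= min_share


print(pmf_reached(very_disappointed=15, respondents=30))  # True
print(pmf_reached(very_disappointed=20, respondents=25))  # False, sample too small
```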

08 · Competition

Everyone owns a box. Nobody owns ours.

Axes: app coverage (single-platform ↔ OS-wide) × output (captions ↔ voice dub). Our only defensible cell is native-installed-app voice dub with original audio preserved.

Player · Reach · Native apps · Live audio · Keeps original
Zoom AI Voice Translator · Zoom only, 5 langs
DubTab / Whisperr · Browser tabs, 60+ langs
Teams + DeepL Voice · Teams only
Free captions (do-nothing) · Per-platform · ~
Babelio · OS-wide, native

Big Tech will ship in-product translation — but an OS-wide cross-vendor layer cannibalizes nobody's core, so no single platform owner is incentivized to build it. The open lane survives precisely because it sits between the walled gardens.

09 · Moat

The capture is a head start. The telemetry is the moat.

We're blunt: the audio tap uses public CoreAudio / WASAPI APIs — a funded competitor replicates it in a quarter. It buys time to collect the one asset that compounds.

The $10B headline thesis

The $10B outcome is not the consumer app — it's becoming the cross-application real-time-dub primitive that other software embeds. The telemetry → an eval-proven "best dub per app / codec / network" capability → packaged as a virtual-mic / SDK that conferencing, accessibility, e-learning, and contact-center vendors license rather than rebuild.

When the model is free, value accrues below it — at the capture + eval layer. That's where we sell.

10 · Go-to-Market

Seed the communities that already mine sentences.

PLG self-serve + founder-led community seeding for the first 100. No outbound sales at $144/yr. The buyer is a single learner who already funds their own immersion stack.

Install friction (Gatekeeper / antivirus) is the #3 risk — mitigated by notarized signed builds, a first-run trust playbook, and seeding via a trusted community member, not a cold drop.

11 · Team

The honest open gap. One hire decides everything.

We won't pretend otherwise: the entire moat and the 10-week MVP hinge on one rare profile — a Rust + CoreAudio / WASAPI engineer who can ship per-process native audio capture across both OSes. That person is not yet confirmed.

Founder decisions in flight

Funding path (raise vs bootstrap to ~$10K MRR) and the audio-engineer commitment are open. The economics support a conditional bootstrap — Cartesia retail COGS is viable from day one, breakeven ~460 paying users — so the raise is to compress time-to-moat, not to survive.
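
A back-of-envelope sketch of what those two numbers imply. It assumes the $12/mo Pro price and the 73% gross margin from the business-model slide apply to every paying user; the monthly fixed cost it backs out is our inference, not a stated figure.

```python
import math

# Assumptions (ours): every paying user is on flat $12/mo Pro at 73% gross margin.
PRICE_PER_MONTH = 12.00
GROSS_MARGIN = 0.73


def implied_fixed_cost(breakeven_users: int) -> float:
    """Monthly fixed cost that would make the given user count the breakeven point."""
    return breakeven_users * PRICE_PER_MONTH * GROSS_MARGIN


def users_for_mrr(target_mrr: float) -> int:
    """Paying users needed to reach a monthly recurring revenue target."""
    return math.ceil(target_mrr / PRICE_PER_MONTH)


print(f"Implied fixed cost at 460 users: ${implied_fixed_cost(460):,.0f}/mo")
print(f"Users needed for $10K MRR: {users_for_mrr(10_000)}")
```

On these assumptions, breakeven at ~460 users implies roughly $4.0K/mo of fixed cost, and ~$10K MRR needs about 834 paying users.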

12 · The Ask

Raise to compress time-to-moat.

Pre-seed · 15–18 months runway
$1.2–1.5M pre-seed, to reach ~$400–500K ARR plus a defensible native-desktop-capture wedge with a live telemetry flywheel.
55% · Engineering

The Rust + CoreAudio/WASAPI capture hire + founder. Ships the native pipeline, telemetry flywheel, eval harness.

25% · Validation & GTM

4-week field plan, community seeding, immersion-YouTuber integrations, the loop-K test.

20% · Infra & COGS buffer

Cartesia / Deepgram inference, signing & notarization, billing, the on-device-STT margin lever.
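
For reference, the 55 / 25 / 20 split translates into dollars as follows at both ends of the $1.2–1.5M range. A small sketch of the arithmetic; the names are ours.

```python
# Use of funds: percentage split applied to the low and high end of the raise.
ALLOCATION_PCT = {
    "Engineering": 55,
    "Validation & GTM": 25,
    "Infra & COGS buffer": 20,
}


def split_raise(raise_usd: int) -> dict[str, int]:
    """Dollar amount per bucket for a given raise size (integer arithmetic)."""
    return {bucket: raise_usd * pct // 100 for bucket, pct in ALLOCATION_PCT.items()}


low, high = split_raise(1_200_000), split_raise(1_500_000)
for bucket in ALLOCATION_PCT:
    print(f"{bucket}: ${low[bucket]:,} to ${high[bucket]:,}")
```

That puts Engineering at $660K to $825K, Validation & GTM at $300K to $375K, and the Infra & COGS buffer at $240K to $300K.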

Milestones: capture engineer committed (week 0) → 4-week validation passed with real loop-K and CAC (month 1) → monetizable MVP shipped at ≤$0.50/active-hr (month 4) → ~$10K MRR (month 9–10) → the telemetry asset that underwrites the B2B/SDK Series A.