Operating System · Artifact 12

Q1 OKRs

The first operating quarter has exactly one job: find out if the wedge is real and ship a dual-track MVP a real immersion learner keeps running. Four objectives, twelve outcome-KRs, every one with a baseline, a target, a single owner and a checkpoint date. Pre-revenue, pre-PMF — so these are validation OKRs, not growth OKRs.

Objectives

Key results

North Star

≥4sess/wk

Quarter

Q1first

How this quarter is scored

Quarter clock: kickoff Jun 1, mid-quarter check Jul 15, close Aug 31 (~13 weeks; maps 1:1 to the six phases in 17-90-day-plan.html). Score each KR 0.0–1.0 at close; 0.7 is "hit" for a validation quarter, not 1.0 — these are learning bets, and a clean kill is a win, not a miss. The whole quarter rests on one dependency: the audio engineer (Objective 3). If that hire isn't committed by the Jun 20 checkpoint, Objectives 2 and 4 cannot complete and the quarter is re-planned around the bootstrap path (10-financial-model.html).

Objective 1 — Validate the wedge

Customer truth Owner: Founder/CEO

Prove that serious immersion learners feel the live / native-content pain badly enough to pay — or kill the wedge cleanly.

	Key result (outcome)	Baseline		Target	Owner	Checkpoint
1.1	% of interviewed learners who recall a specific painful live / native-content incident in the last 30 days (Mom Test verbatim).	0% (0 interviews)	→	≥30% / 15–20	Founder	Jun 20
1.2	Paying / LOI users from the concierge metered test who pay a rate that clears subtitle COGS (~$0.31/active-hr).	0	→	≥5	Founder	Jul 15
1.3	WTP ceiling vs dub-COGS gap resolved with real data: Van Westendorp + metered $5/hr test run on paying ICP buyers.	unvalidated	→	30–50 buyers	Founder	Aug 15

Why this objective, over the alternatives

We picked "validate the wedge" over "start growth / chase installs" because REVIEW.md scores the whole research analytically fundable but real-world unfundable today on exactly one gap: zero user evidence (1 [PAST] datum, 0 interviews). Every CAC, loop, and retention number in the playbook is currently a [HYP]. Spending the quarter on acquisition would compound that bet on top of an unvalidated foundation. The single most important number this quarter is not MRR — it's whether ≥30% of real learners recall a specific painful incident; if they don't, the kill-criterion fires and we save a year. We deliberately framed KRs as outcomes (recall %, paying users, gap resolved) — not "do 20 interviews," which would be an activity that can be 100% complete and still teach us nothing.

Kill-criterion (this is a win, not a miss) If KR 1.1 lands <30% recall or nobody in KR 1.2 pays a rate clearing COGS — stop, do not ship the MVP. Re-segment (clipper/creator hedge) or pivot to the B2B/SDK trajectory before burning the audio-engineer runway. A clean kill at the Jul 15 checkpoint scores the objective 1.0.

Objective 2 — Ship the dual-track MVP

Product · the thing extensions can't do Owner: Audio/OS engineer

Get a real immersion learner watching native-desktop audio with live captions + whisper-under-original, passing the eval gate, on one OS.

	Key result (outcome)	Baseline		Target	Owner	Checkpoint
2.1	Glass-to-glass p95 latency on real lecture audio (whisper-dub layer) — the launch "done" gate.	prototype, unmeasured	→	<700ms	Audio eng	Aug 15
2.2	Native per-process audio capture verified working on real native clients per target OS (e.g. desktop Webex, an LMS player, VLC).	0 clients	→	≥3	Audio eng	Jul 15
2.3	babelio-evals gold set live and gating deploys: MT adequacy on the 20-example set, with capture/eval telemetry logging from session 1.	0/5 (no harness)	→	≥4/5	Audio eng	Jul 15

Why this objective, over the alternatives

We picked "narrowest native-desktop capture on ONE OS" over "polish a broad multi-OS app" because product.md is explicit: the native per-process audio capture is the one thing browser extensions structurally cannot do — and it's also the hardest, riskiest part of the build (Wk 3–6 = most of the risk). Shipping breadth before that primitive is proven would burn the quarter on the easy 80% while the load-bearing 20% stays unvalidated. We made dual-track the default (captions + whisper-under-original), not auto-mute dub, because the resolved ICP — immersion learners — needs the original voice preserved; muting destroys their learning job (the P0 ICP/mode collision REVIEW.md flagged). And we gate KR 2.1/2.3 on the babelio-evals harness because product.md rule 2 is "evals before features" — without the gate, latency and hallucination drift silently and we'd ship a vitamin.

Hard dependency This entire objective cannot start without Objective 3's hire on payroll. The 10-week MVP timeline in product.md §3 is explicitly contingent on a committed Rust + CoreAudio/WASAPI engineer. No engineer → re-plan to bootstrap (subtitle-only, slower) and move 2.1–2.3 to Q2.

Objective 3 — Close the load-bearing hire

Team · the single dependency load-bearing Owner: Founder/CEO

Commit a Rust + CoreAudio/WASAPI engineer to payroll before any MVP work begins — or the whole quarter is fiction.

	Key result (outcome)	Baseline		Target	Owner	Checkpoint
3.1	Audio/OS-internals engineer signed and on payroll (or committed co-founder), able to ship native capture across both OSes.	0/1 (unconfirmed)	→	1/1 signed	Founder	Jun 20
3.2	Founder-market-fit gap in BRIEF.md closed: technical background recorded and the OS-internals plan documented for investors.	absent (Team 0/50)	→	documented	Founder	Jun 20
3.3	Native capture spike de-risks the riskiest primitive: one real native client's per-process audio captured end-to-end within the trial period.	0 clients captured	→	1 client	Audio eng	Jul 1

Why this objective, over the alternatives

We made hiring a standalone objective rather than a line buried in Product because REVIEW.md scores Team 0/50 ("no founder-market fit = no meeting") and names this exact hire as the P0 dependency the entire moat and 10-week timeline rest on. 10-financial-model.html calls it the load-bearing cost — ~70% of R&D spend through Q1–Q3. The honest sequencing is brutal: this objective gates Objectives 2 and 4 entirely, so its checkpoint (Jun 20) is the earliest and the most consequential of the quarter. We chose KR 3.3 — an actual capture spike inside the trial — over a softer "interview N candidates," because the failure mode here isn't finding someone, it's discovering after onboarding that the OS-internals work is harder than scoped. A paid trial spike surfaces that in week 3, not month 3.

▲

If 3.1 misses, the quarter changes shape — do not pretend otherwise

No committed engineer by Jun 20 → Objectives 2 and 4 are formally de-scoped to Q2, and the quarter re-plans around the bootstrap subtitle-only path in 10-financial-model.html. The worst outcome is leaving Objective 2's KRs on the board, scoring them red in August, and learning nothing — that's a hiring failure dressed as an execution failure.

Objective 4 — Reach the North Star with real users

Activation & habit north star Owner: Founder/CEO

Get a hand-onboarded cohort to ≥4 native-desktop sessions/user/week and take the first honest PMF reading.

	Key result (outcome)	Baseline		Target	Owner	Checkpoint
4.1	Hand-onboarded wedge users sustaining the North Star: ≥4 native-desktop sessions/user/week for 2 consecutive weeks.	0 users	→	≥15	Founder	Aug 31
4.2	Activation rate — % of installs reaching the activation event (first exported/shared sentence-mining card) by D7.	unmeasured	→	≥45%	Founder	Aug 15
4.3	First Sean Ellis PMF reading administered at day-14 across the cohort — measured, not claimed.	0 hands, no reading	→	≥30 hands surveyed	Founder	Aug 31

Why this objective, over the alternatives

We picked "North Star sessions + a real PMF reading" over "loop K / viral growth" because the honest sequence is habit-then-distribution: product.md's North Star is ≥4 native-desktop sessions/user/week ("did they trust it enough to leave it running over a real lecture") — MAU is vanity. The loop factor K and channel-test CAC matter, but they're a Q2 bet once we know the product is sticky; chasing virality before a 15-user cohort sustains the North Star would be optimizing a funnel into a leaky bucket. We deliberately set KR 4.3 as "administer the survey to ≥30 hands," not "hit 40% very-disappointed" — because REVIEW.md flags PMF as PENDING and forbids claiming it until measured. The outcome we own this quarter is getting an honest reading on enough hands; the 40% threshold is a finding we report, not a target we set, so we don't pressure ourselves into faking PMF.

OKR anti-patterns · the check we ran before locking these

Run this checklist before adopting any OKR set — ours or next quarter's. If a KR trips one of these, it goes back to the drafting table.

✗

KRs are activities, not outcomes.

"Ship the MVP" / "do 20 interviews" can be 100% complete and teach nothing. Our fix: every KR is a measured outcome — latency <700ms, ≥30% recall, ≥4 sessions/wk — not a task done.

✗

Too many objectives.

A pre-PMF team chasing 8 objectives achieves none. Our fix: exactly 4, all serving one theme — validate the wedge and ship the thing that validates it. Growth/loop OKRs are explicitly deferred to Q2.

✗

No single owner per KR.

"The team owns it" means nobody owns it. Our fix: every KR names one person (Founder or Audio eng) — and at this size that's the honest map of who actually does the work.

✗

No baseline — you can't tell if you moved.

A target with no starting number is a wish. Our fix: every KR carries today's baseline (mostly 0 / unmeasured — honest for a prototype) next to its target.

✗

No mid-quarter checkpoint.

Discovering a miss at quarter-close is too late to course-correct. Our fix: staggered checkpoint dates (Jun 20 → Aug 31), with the load-bearing hire checked first so a miss re-plans the quarter early.

✗

Sandbagging or moon-shotting the targets.

Targets you'll definitely hit teach nothing; ones you can't hit demoralize. Our fix: targets are pulled from research benchmarks (45% activation, <700ms, ≥30% recall), and 0.7 is "hit" — leaving honest stretch.

✗

Vanity metrics as KRs.

Installs, signups, MAU look like progress and hide a leaky bucket. Our fix: the North Star is sustained sessions and a measured PMF reading, not download counts — habit before headcount.

✗

OKRs disconnected from the strategic bet.

A set of "nice things to improve" with no through-line. Our fix: all 4 objectives ladder to the one Q1 bet from the strategy memo — validate the wedge, ship the dual-track MVP — and each rationale defends the choice against a named alternative.

Cross-references

00-strategy-memo.htmlThe Q1 strategic bet these four objectives ladder up to (validate wedge + ship dual-track MVP).
17-90-day-plan.htmlWeek-by-week execution of these KRs across six 15-day phases; checkpoint dates align.
14-kpi-dashboard.htmlWhere these KR metrics (North Star sessions, activation %, PMF, latency) are tracked weekly.
11-hiring-plan.htmlThe audio-engineer scorecard behind Objective 3 — mission, outcomes, comp range.
10-financial-model.htmlThe load-bearing cost and the bootstrap re-plan if Objective 3 misses.
16-risk-register.htmlThe hire dependency, churn, and WTP-vs-COGS gap as tracked P0/P1 risks with tripwires.

Q1 OKRsOKR на Q1

How this quarter is scoredКак считается этот квартал

Objective 1 — Validate the wedgeЦель 1 — Проверить клин

Objective 2 — Ship the dual-track MVPЦель 2 — Выпустить dual-track MVP

Objective 3 — Close the load-bearing hireЦель 3 — Закрыть несущий найм

If 3.1 misses, the quarter changes shape — do not pretend otherwiseЕсли 3.1 не закрыта, квартал меняет форму — не делать вид, что нет

Objective 4 — Reach the North Star with real usersЦель 4 — Достичь North Star с реальными пользователями

OKR anti-patterns · the check we ran before locking theseАнти-паттерны OKR · проверка перед фиксацией

Q1 OKRs

How this quarter is scored

Objective 1 — Validate the wedge

Objective 2 — Ship the dual-track MVP

Objective 3 — Close the load-bearing hire

If 3.1 misses, the quarter changes shape — do not pretend otherwise

Objective 4 — Reach the North Star with real users

OKR anti-patterns · the check we ran before locking these