resolveVersion(uid) buckets each user from a hashed uid, reads the cached config/live or the variant per experiments/{id}, and defaults to live when no experiment is active, so there is zero per-turn Firestore hit.
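The bucketing described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual implementation: the hash (FNV-1a), the `active` status value, the 0 to 100 range for `percentB`, and the helper names `pickPair`, `bucketOf`, and `makeCachedSnapshot` are all hypothetical. The loader passed to the cache stands in for whatever reads config/live and experiments/{id}.

```typescript
// Illustrative sketch only. Field names come from the text where possible;
// anything marked "assumed" is hypothetical.
type VersionPair = { promptVersionId: string; corpusVersionId: string };

type Experiment = {
  id: string;
  variantA: VersionPair;
  variantB: VersionPair;
  percentB: number; // share of users routed to variantB, assumed 0..100
  scope: string; // e.g. 'pair'
  status: string; // 'released' appears in the text; 'active' is assumed
};

type RoutingSnapshot = { live: VersionPair; experiment: Experiment | null };

// 32-bit FNV-1a over UTF-16 code units; stands in for the unspecified hash.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Map a uid to a stable bucket in [0, 100). Salting with the experiment id
// (an assumption) keeps buckets independent across experiments.
function bucketOf(uid: string, experimentId: string): number {
  return fnv1a32(`${experimentId}:${uid}`) % 100;
}

// Pure routing decision: the same uid and snapshot always give the same pair.
function pickPair(uid: string, snapshot: RoutingSnapshot): VersionPair {
  const exp = snapshot.experiment;
  if (exp === null || exp.status !== "active") return snapshot.live;
  return bucketOf(uid, exp.id) < exp.percentB ? exp.variantB : snapshot.live;
}

// Module-level TTL cache so routing does not read the store on every turn.
// `load` is a hypothetical function that fetches the current snapshot.
function makeCachedSnapshot(
  load: () => Promise<RoutingSnapshot>,
  ttlMs: number,
  now: () => number = Date.now
): () => Promise<RoutingSnapshot> {
  let cached: { value: RoutingSnapshot; expiresAt: number } | null = null;
  return async () => {
    if (cached !== null && cached.expiresAt > now()) return cached.value;
    const value = await load();
    cached = { value, expiresAt: now() + ttlMs };
    return value;
  };
}
```

Keeping the routing decision a pure function of the uid and the cached snapshot is what would make the assignment repeatable from turn to turn; the cache only controls how often the snapshot is refreshed.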
promptVersionId and corpusVersionId at once. A variant is always a coupled bundle; pair-scope changes the whole point.

promptVersionId is written then). Nominate candidates on the Proof Spike · the Bench.
experiments/{id}

| Run | Variant pair (B) | Scope | % B | B turns | B activation | Status |
|---|---|---|---|---|---|---|
| Loading the run ledger… | | | | | | |
Activation is real version-stamped telemetry (joined days→run→arm); committed-30d / graduation depth accrue over time and are not fabricated here. One live pair at a time: releasing a winner supersedes the prior edition and routes everyone, subject to the imprimatur grant on the next screen.
percentB) of users to a variant edition while the rest stay on the live one. Writes experiments/{id} { variantA, variantB, percentB, scope, status }. A variant is itself a coupled (promptVersionId, corpusVersionId) bundle; scope:'pair' moves both at once.

resolveVersion(uid) deterministically buckets a user from a hashed uid into the live pair or an experiment variant per experiments/{id}, app-cached (module TTL, no per-turn Firestore hit), defaulting to config/live when no experiment is active. Same reader, same bucket, every turn, and no added latency.

ground(retrieval)). So a variant always stamps both IDs, and a run is compatibility-guarded server-side: route only if the prompt's required fields ⊆ the corpus's retrievalInterface. A blocked variant returns an HTTP 409 with the field reason before it can route a single reader.

config/live: every reader routes to the winner and experiments/{id}.status='released'. This becomes the live pair (its own audit entry). No deploy at any step.

resolveVersion / config/live, so a bucketed reader receives the byte-identical pair the simulation ran. If they could diverge, the experiment would be measuring a fiction.

config/live pair is in force; releasing a winner supersedes the prior edition. The backend allows exactly one active experiment against the live baseline; stop or release it before ordering the next.
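The server-side compatibility guard mentioned above (route only if the prompt's required fields ⊆ the corpus's retrievalInterface, otherwise answer 409 with the field reason) might look something like this sketch. Only `retrievalInterface` and the 409 status come from the text; the `requiredFields` property, the result shape, and the function name `checkCompatibility` are assumptions made for illustration.

```typescript
// Hypothetical shapes: `requiredFields` is an assumed name for the fields a
// prompt version needs; `retrievalInterface` is the name used in the text.
type PromptVersion = { id: string; requiredFields: string[] };
type CorpusVersion = { id: string; retrievalInterface: string[] };

type GuardResult =
  | { ok: true }
  | { ok: false; status: 409; missingFields: string[]; reason: string };

// Subset check: every field the prompt requires must be offered by the corpus.
function checkCompatibility(
  prompt: PromptVersion,
  corpus: CorpusVersion
): GuardResult {
  const available = new Set(corpus.retrievalInterface);
  const missingFields = prompt.requiredFields.filter((f) => !available.has(f));
  if (missingFields.length === 0) return { ok: true };
  return {
    ok: false,
    status: 409, // conflict: the variant is blocked before routing any reader
    missingFields,
    reason:
      `prompt ${prompt.id} requires fields missing from corpus ` +
      `${corpus.id}: ${missingFields.join(", ")}`,
  };
}
```

Running a check like this when the experiment is created, rather than at routing time, would match the stated goal of blocking a variant before it reaches a single reader.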