Stage 0 — Define the boundary
Choose one narrow capability with measurable outcomes. Document the request contract, data classification, current baseline, risk tier, and operational budget. Do not begin with a general assistant or unrestricted tool use.
Exit criteria: one signed champion package, one evaluation suite, one rollback path, and named owners for model, evaluation, security, and production.
Stage 1 — Build the immutable supply chain
Create content-addressed packages, manifests, lineage records, evaluation cards, and a file-backed registry. Implement package verification and a command that can reconstruct the current champion from source artifacts.
Exit criteria: artifacts cannot be overwritten; aliases update atomically; every package has provenance and a digest.
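As an illustration of the Stage 1 exit criteria, the following is a minimal sketch of a file-backed registry, not a definitive implementation. The class name `FileRegistry`, the directory layout, and the manifest fields are assumptions made for this example; the guide does not prescribe them. It shows write-once storage keyed by a SHA-256 digest, verification by recomputing the digest, and alias updates through an atomic rename.

```python
import hashlib
import json
import os
import tempfile
from pathlib import Path


class FileRegistry:
    """Hypothetical file-backed registry: write-once blobs, atomic aliases."""

    def __init__(self, root: Path) -> None:
        self.blobs = root / "blobs"
        self.aliases = root / "aliases"
        self.blobs.mkdir(parents=True, exist_ok=True)
        self.aliases.mkdir(parents=True, exist_ok=True)

    def put(self, payload: bytes, provenance: dict) -> str:
        """Store a package under its SHA-256 digest; never overwrite."""
        digest = hashlib.sha256(payload).hexdigest()
        # Mode "x" raises FileExistsError if the path exists (write-once).
        with open(self.blobs / digest, "xb") as blob:
            blob.write(payload)
        manifest = {"digest": digest, "provenance": provenance}
        with open(self.blobs / f"{digest}.json", "x") as out:
            json.dump(manifest, out, sort_keys=True)
        return digest

    def verify(self, digest: str) -> bool:
        """Recompute the digest from the stored bytes and compare."""
        data = (self.blobs / digest).read_bytes()
        return hashlib.sha256(data).hexdigest() == digest

    def set_alias(self, name: str, digest: str) -> None:
        """Point an alias such as 'champion' at a verified digest."""
        if not self.verify(digest):
            raise ValueError(f"digest mismatch for {digest}")
        with tempfile.NamedTemporaryFile(
            "w", dir=self.aliases, delete=False
        ) as tmp:
            tmp.write(digest)
        # os.replace swaps the alias file in one step on the same filesystem.
        os.replace(tmp.name, self.aliases / name)

    def resolve(self, name: str) -> str:
        """Return the digest an alias currently points to."""
        return (self.aliases / name).read_text()
```

Readers of an alias see either the old digest or the new one, never a partially written file, because the new content is written to a temporary file first and then renamed into place.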
Stage 2 — Add a challenger loop
Choose one low-risk operator such as adapter training, quantization, or distillation. Generate a small number of descendants offline. Compare them with the champion and a no-op baseline under equal budgets.
Exit criteria: one complete breeding cycle with a recorded negative or positive decision and no production impact.
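The Stage 2 comparison can be sketched as a small decision function. This is an illustrative sketch only: the `EvalResult` fields, the `min_gain` threshold, and the decision labels are hypothetical, and the no-op is assumed here to be a control candidate evaluated on the same suite as the champion and the challengers.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EvalResult:
    name: str
    score: float        # higher is better, on the shared evaluation suite
    budget_used: float  # hypothetical unit, e.g. GPU-hours


def decide(
    champion: EvalResult,
    noop: EvalResult,
    challengers: List[EvalResult],
    budget: float,
    min_gain: float = 0.01,
) -> Tuple[str, str]:
    """Return (decision, candidate name); both outcomes get recorded."""
    # A challenger must beat the stronger of champion and no-op control.
    baseline = max(champion.score, noop.score)
    # Equal budgets: candidates that overspent are not comparable.
    eligible = [c for c in challengers if c.budget_used <= budget]
    best = max(eligible, key=lambda c: c.score, default=None)
    if best is not None and best.score - baseline >= min_gain:
        return ("advance_to_shadow", best.name)
    return ("keep_champion", champion.name)
```

A negative result ("keep_champion") is a valid outcome of the cycle and is recorded in the same way as a positive one.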
Stage 3 — Add shadow and canary release
Mirror requests to an evaluated challenger and compare its outputs with the champion's, then expose the challenger to a bounded, low-risk cohort. Exercise rollback before expanding traffic.
Exit criteria: release evidence joins to lineage, canary abort is automatic, and rollback recovery time is measured.
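One way to make the canary abort automatic and the rollback time measurable is sketched below. The thresholds in `CanaryLimits`, the verdict labels, and the choice of p95 latency as a signal are assumptions for illustration; real limits would come from the operational budget defined in Stage 0.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class CanaryLimits:
    # Hypothetical thresholds; tune per capability and risk tier.
    max_error_rate: float = 0.02
    max_latency_ratio: float = 1.25
    min_requests: int = 200


def canary_verdict(
    errors: int,
    requests: int,
    canary_p95_ms: float,
    champion_p95_ms: float,
    limits: CanaryLimits = CanaryLimits(),
) -> str:
    """Return 'hold', 'abort', or 'pass' for the current canary window."""
    if requests < limits.min_requests:
        return "hold"  # not enough evidence to judge either way
    if errors / requests > limits.max_error_rate:
        return "abort"
    if canary_p95_ms / champion_p95_ms > limits.max_latency_ratio:
        return "abort"
    return "pass"


def timed_rollback(rollback: Callable[[], None]) -> float:
    """Run a rollback callable and return its duration in seconds."""
    start = time.monotonic()
    rollback()
    return time.monotonic() - start
```

In this sketch an "abort" verdict would trigger the rollback callable without human intervention, and the returned duration is the recovery-time figure to record alongside the release evidence.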
Stage 4 — Add routing and specialization
Introduce a second capability or cost tier. Use explicit contracts and static routing before learned routing. Track route quality and traffic concentration.
Exit criteria: specialists have defined niches, a safe fallback exists, and router changes are separately versioned.
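Static routing with a safe fallback can be as small as a versioned lookup table. The sketch below is an assumption-laden illustration: the capability names, model names, and `ROUTER_VERSION` string are hypothetical, and traffic concentration is measured here simply as the share of requests sent to the busiest target.

```python
from collections import Counter
from typing import Iterable

# Hypothetical identifiers; the router table is versioned separately
# from the model packages it points to.
ROUTER_VERSION = "router-v1"
ROUTES = {
    "summarize": "summarizer-small",
    "extract": "extractor-base",
}
FALLBACK = "champion-general"


def route(capability: str) -> str:
    """Static routing: an explicit table lookup with a safe fallback."""
    return ROUTES.get(capability, FALLBACK)


def traffic_concentration(targets: Iterable[str]) -> float:
    """Share of requests handled by the single busiest target."""
    counts = Counter(targets)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return max(counts.values()) / total
```

For example, `route("translate")` returns the fallback because no specialist claims that niche, and a concentration close to 1.0 suggests the specialists are receiving little traffic.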
Stage 5 — Add a viability controller
Automate recommendations, not authority. The controller computes score decompositions and proposes one of four actions: no-op, archive, shadow, or retire. Human approval remains required for structural release.
Exit criteria: controller decisions are explainable, resource projections reconcile with actuals, and oscillation controls are tested.
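The controller's behavior can be sketched as a weighted score decomposition with a dead band between two thresholds, which is one simple form of oscillation control. Everything specific here is an assumption for illustration: the component names, the weights, the thresholds, and the mapping of low scores to "retire" for deployed models and "archive" for undeployed ones.

```python
from typing import Dict

# Hypothetical weights; each metric is assumed normalized to [0, 1].
WEIGHTS = {"quality": 0.5, "cost": 0.3, "safety": 0.2}


def decompose(metrics: Dict[str, float]) -> Dict[str, float]:
    """Per-component contributions, so a proposal can be explained."""
    return {name: WEIGHTS[name] * metrics[name] for name in WEIGHTS}


def recommend(
    metrics: Dict[str, float],
    deployed: bool,
    promote_at: float = 0.7,
    retire_at: float = 0.4,
) -> dict:
    """Propose an action; the controller never applies it itself."""
    parts = decompose(metrics)
    total = sum(parts.values())
    if total >= promote_at:
        action = "shadow"
    elif total <= retire_at:
        action = "retire" if deployed else "archive"
    else:
        # Dead band between the thresholds: small score changes
        # do not flip the recommendation back and forth.
        action = "no-op"
    return {
        "action": action,
        "total": total,
        "components": parts,
        "requires_human_approval": True,
    }
```

Returning the component contributions next to the action keeps each proposal explainable, and the fixed `requires_human_approval` flag reflects that the controller recommends rather than decides.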
Stage 6 — Distribute carefully
Add edge, site, or federated nodes only after package signing, revocation, telemetry, and rollback work centrally. Start with read-only local inference before local training.
Exit criteria: trust boundaries, update provenance, client attestation, and site-level rejection are operational.
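Site-level rejection can be sketched as a check that each node runs before loading a package. Note the loud assumption: this example uses an HMAC with a shared key as a stand-in for package signing so that it runs with the standard library alone; a real deployment would more likely use asymmetric signatures so that sites hold only a public key. The function names and the revocation set are hypothetical.

```python
import hashlib
import hmac
from typing import Set


def sign(key: bytes, payload: bytes) -> str:
    """Stand-in signature: HMAC-SHA256 over the package bytes."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def site_accepts(
    key: bytes,
    payload: bytes,
    signature: str,
    revoked_digests: Set[str],
) -> bool:
    """A site rejects revoked or incorrectly signed packages."""
    digest = hashlib.sha256(payload).hexdigest()
    if digest in revoked_digests:
        return False  # revocation wins even if the signature is valid
    expected = sign(key, payload)
    # compare_digest avoids leaking match position through timing.
    return hmac.compare_digest(expected, signature)
```

Checking revocation before the signature means a package that was validly signed but later withdrawn is still refused at the site.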
Stage 7 — Increase automation by risk tier
Low-risk descendants may progress automatically through offline and shadow gates. High-risk changes—permissions, code, data class, tool use, or policy semantics—always require independent review.
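The risk-tier rule above can be expressed as a small gate function. This is a sketch under assumptions: the field names are hypothetical labels for the high-risk change types listed in the text, and the gate names are illustrative.

```python
from typing import Iterable

# Hypothetical labels for the high-risk change types named in the text.
HIGH_RISK_FIELDS = frozenset(
    {"permissions", "code", "data_class", "tool_use", "policy_semantics"}
)


def required_gate(changed_fields: Iterable[str]) -> str:
    """Return the gate a descendant must pass, given what it changes."""
    if HIGH_RISK_FIELDS & frozenset(changed_fields):
        return "independent_review"
    return "automatic_offline_and_shadow"
```

For example, a descendant that changes only adapter weights would progress automatically, while one that also touches `tool_use` would be held for independent review.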
Program timeline
A credible pilot can complete Stages 0–3 in roughly one quarter for one narrow capability. Later stages depend more on governance maturity than on algorithmic sophistication. Do not compress the sequence by skipping lineage, evaluation, or rollback.
FOR stage IN roadmap
    IMPLEMENT(stage.minimum_components)
    RUN(stage.required_tests)
    IF NOT stage.exit_criteria_pass
        STOP_AND_REMEDIATE
    END IF
END FOR
Source reports used for this guide
These reports are preserved verbatim in the site archive. The guide above is an editorial synthesis and may narrow, qualify, or reorganize claims from the source material.