Goal
The first model breeding lab should be boring. It should run offline, use signed parents, generate bounded descendants, evaluate independently, and promote only through normal release controls.
Minimal architecture
- Source registry: immutable parent models, adapters, data manifests, and licenses.
- Candidate factory: allowlisted operators such as distill, adapter train, quantize, and merge.
- Evaluator: fixed suites for task, calibration, robustness, safety, and resource cost.
- Viability controller: compares candidates against champion and no-op.
- Release controller: shadow, canary, champion, rollback.
- Ledger: records lineage, decisions, and scorecards.
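The viability controller's comparison can be sketched in a few lines of Python. This is a minimal illustration, not the guide's implementation. The `Scorecard` fields, the `min_gain` margin, and the `max_cost_ratio` cap are assumptions made for the example. The no-op is assumed here to be an unmodified parent re-scored through the same pipeline, so that evaluator noise cannot pass for improvement.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Scorecard:
    # Hypothetical fields; a real scorecard would cover every frozen suite.
    model_id: str
    task: float    # higher is better
    safety: float  # higher is better
    cost: float    # lower is better


@dataclass(frozen=True)
class Decision:
    promote: bool
    candidate: Optional[str]
    reason: str  # recorded in the ledger either way


def viability_select(
    scorecards: List[Scorecard],
    champion: Scorecard,
    no_op: Scorecard,
    min_gain: float = 0.01,       # assumed margin over the stronger baseline
    max_cost_ratio: float = 1.2,  # assumed cost cap relative to the champion
) -> Decision:
    # A candidate must beat both the champion and the no-op baseline.
    bar = max(champion.task, no_op.task) + min_gain
    cost_cap = champion.cost * max_cost_ratio
    viable = [
        s for s in scorecards
        if s.task >= bar and s.safety >= champion.safety and s.cost <= cost_cap
    ]
    if not viable:
        return Decision(False, None, f"no candidate reached task >= {bar:.3f} "
                                     "within safety and cost limits")
    best = max(viable, key=lambda s: s.task)
    return Decision(True, best.model_id,
                    f"task {best.task:.3f} clears bar {bar:.3f}")
```

Returning a reason string with every decision, including rejections, is what lets the ledger explain a no-op outcome later.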
PROCEDURE run_breeding_lab_round(failure_cluster)
    parents <- SELECT_APPROVED_PARENTS(failure_cluster)
    candidates <- []
    FOR each operator IN ALLOWED_OPERATORS
        candidate <- GENERATE_IN_SANDBOX(operator, parents, failure_cluster)
        candidates.ADD(candidate)
    END FOR
    scorecards <- EVALUATE_ALL(candidates, frozen_suites)
    decision <- VIABILITY_SELECT(scorecards, current_champion, no_op)
    IF decision.promote
        RELEASE_TO_SHADOW(decision.candidate)
    ELSE
        ARCHIVE_CANDIDATES_WITH_REASONS(candidates, scorecards)
    END IF
END PROCEDURE
Why offline first
Offline generation gives teams time to review provenance, evaluator behavior, and cost. It also prevents candidate models from discovering and adapting to live release mechanics.
Exit criteria for v1
The lab is ready for production-adjacent shadow mode when it can reproduce candidates, reject bad ones, explain no-op decisions, verify hashes, and roll back every promoted artifact.
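Two of these criteria, hash verification and reproducibility, can be checked mechanically. The sketch below is one possible shape, assuming artifacts are available as bytes and the ledger stores a SHA-256 digest per artifact. `LedgerEntry`, `verify_artifact`, and `is_reproducible` are hypothetical names introduced for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class LedgerEntry:
    # Hypothetical ledger record: artifact identity, digest, and lineage.
    artifact_id: str
    sha256: str
    parent_ids: Tuple[str, ...]


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def verify_artifact(entry: LedgerEntry, data: bytes) -> bool:
    # The artifact is accepted only if its digest matches the ledger.
    return sha256_hex(data) == entry.sha256


def is_reproducible(build: Callable[[], bytes], entry: LedgerEntry,
                    runs: int = 2) -> bool:
    # Rebuild the candidate several times; every rebuild must hash to
    # the digest recorded when the candidate was first generated.
    return all(verify_artifact(entry, build()) for _ in range(runs))
```

Byte-identical rebuilds are a strict standard. Training runs that are not bit-for-bit deterministic would need a looser equivalence check, which this sketch does not attempt.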
Source reports used for this guide
These reports are preserved verbatim in the site archive. The guide above is an editorial synthesis and may narrow, qualify, or reorganize claims from the source material.