Lineage is the system memory
A breeding system without lineage is a pile of files. Lineage records how each artifact came to exist, what it inherited, which evidence supported it, and where it was released. The record should be append-only and independent from model-generated descriptions.
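The append-only property can be illustrated with a small in-memory store. This is a minimal sketch under stated assumptions: the `LineageLog` class and its method names are hypothetical, and hash-chaining each entry to the previous one is just one possible way to make later edits detectable.

```python
import hashlib
import json


class LineageLog:
    """Illustrative append-only log: entries can be added but never changed."""

    def __init__(self):
        self._entries = []       # ordered list of (entry_hash, canonical_json)
        self._by_artifact = {}   # artifact_id -> index into _entries

    def append(self, record: dict) -> str:
        artifact_id = record["artifact_id"]
        if artifact_id in self._by_artifact:
            # Overwriting an existing record would break append-only semantics.
            raise ValueError(f"record for {artifact_id} already exists")
        prev_hash = self._entries[-1][0] if self._entries else ""
        # Canonical JSON so the same record always serializes identically.
        payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
        entry_hash = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
        self._by_artifact[artifact_id] = len(self._entries)
        self._entries.append((entry_hash, payload))
        return entry_hash

    def get(self, artifact_id: str) -> dict:
        _, payload = self._entries[self._by_artifact[artifact_id]]
        return json.loads(payload)  # fresh copy, so callers cannot mutate the log

    def verify(self) -> bool:
        """Recompute the hash chain and report whether it is intact."""
        prev_hash = ""
        for entry_hash, payload in self._entries:
            expected = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
            if expected != entry_hash:
                return False
            prev_hash = entry_hash
        return True


log = LineageLog()
log.append({"artifact_id": "a1", "parent_ids": [], "operator": "train"})
log.append({"artifact_id": "c1", "parent_ids": ["a1"], "operator": "adapter_merge"})
```

Only structured fields go into the log in this sketch. Free-text or model-written descriptions would be stored elsewhere, so the record itself never depends on them.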
Minimum lineage record
lineage_record <- {
    artifact_id: CONTENT_HASH(all_package_files),
    parent_ids: [parent_a, parent_b],
    operator: "adapter_merge",
    operator_version: "2.1.0",
    operator_config_hash: HASH(config),
    base_family: "example-family-v3",
    tokenizer_id: "tokenizer-sha256",
    training_data_manifest: "dataset-manifest-sha256",
    evaluation_suite_id: "suite-2026-06",
    evaluation_evidence_id: "evidence-sha256",
    created_at_utc: "2026-06-26T00:00:00Z",
    created_by: "controlled-pipeline",
    approvals: ["ml-owner", "security-owner"],
    rollback_target: parent_a,
    lifecycle_state: "candidate"
}
Inherited versus recomputed properties
Do not blindly inherit metadata. Parentage, license obligations, restricted data exposure, and known vulnerabilities propagate unless explicitly resolved. Performance scores, calibration, resource use, and safety evidence must be recomputed for the descendant.
| Property | Inherit? | Required action |
|---|---|---|
| Parent identifiers | Yes | Record all direct parents. |
| License obligations | Yes | Compute the strictest combined obligations. |
| Data restrictions | Yes | Propagate unless a verified unlearning process applies. |
| Benchmark scores | No | Re-run on the descendant. |
| Safety approval | No | Re-evaluate under current policy. |
| Runtime compatibility | Verify | Test exact packaged artifact. |
| Known vulnerabilities | Yes | Mark unresolved until retested. |
| Rollback target | Assign | Choose an artifact proven deployable. |
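The table can be sketched as a function that builds a descendant's starting metadata from its parents. This is an illustration only: the field names beyond those in the lineage record, the function name, and the sample values are assumptions, and taking the union of parent obligations is a simplified stand-in for computing the strictest combined obligations.

```python
def derive_descendant_metadata(parents: list[dict]) -> dict:
    """Combine parent metadata following the inherit/recompute table."""

    def union_of(field: str) -> list[str]:
        # Merge one list-valued field across all parents, without duplicates.
        return sorted(set().union(*(p[field] for p in parents)))

    return {
        # Inherited: record every direct parent.
        "parent_ids": [p["artifact_id"] for p in parents],
        # Inherited: union used here as an approximation of "strictest combined".
        "license_obligations": union_of("license_obligations"),
        # Inherited unless a verified unlearning process applies (not modeled).
        "data_restrictions": union_of("data_restrictions"),
        # Inherited and marked unresolved until the descendant is retested.
        "known_vulnerabilities": [
            {"id": v, "status": "unresolved"} for v in union_of("known_vulnerabilities")
        ],
        # Not inherited: must come from re-running on the descendant.
        "benchmark_scores": None,
        "safety_approval": None,
        # Verify: unknown until the exact packaged artifact is tested.
        "runtime_compatibility": "unverified",
        # Assign: chosen later from artifacts proven deployable.
        "rollback_target": None,
    }


parent_a = {
    "artifact_id": "a1",
    "license_obligations": ["attribution"],
    "data_restrictions": [],
    "known_vulnerabilities": ["VULN-1"],
    "benchmark_scores": {"accuracy": 0.9},
}
parent_b = {
    "artifact_id": "b1",
    "license_obligations": ["attribution", "non-commercial"],
    "data_restrictions": ["restricted-set-7"],
    "known_vulnerabilities": [],
    "benchmark_scores": {"accuracy": 0.8},
}
child = derive_descendant_metadata([parent_a, parent_b])
```

The parents' benchmark scores are deliberately ignored: the descendant starts with `None` and cannot be promoted on borrowed evidence.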
Content-addressed identifiers
Use a cryptographic digest of the complete package rather than a mutable name such as best-model-final. Human-readable aliases can point to immutable identifiers, but promotions must never overwrite the underlying artifact.
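One way to compute such an identifier is to hash every file's relative path and contents in a fixed order. The sketch below assumes SHA-256 and a package stored as a directory; the `content_hash` function, the `sha256:` prefix, and the file names are illustrative choices, not a prescribed format.

```python
import hashlib
import tempfile
from pathlib import Path


def content_hash(package_dir: Path) -> str:
    """SHA-256 over each file's relative path and bytes, in sorted path order."""
    digest = hashlib.sha256()
    files = sorted(
        (p for p in package_dir.rglob("*") if p.is_file()),
        key=lambda p: p.relative_to(package_dir).as_posix(),
    )
    for path in files:
        rel = path.relative_to(package_dir).as_posix().encode("utf-8")
        # Whole-file read for brevity; large weight files should be streamed.
        data = path.read_bytes()
        # Length-prefix each field so file boundaries cannot be ambiguous.
        digest.update(len(rel).to_bytes(8, "big") + rel)
        digest.update(len(data).to_bytes(8, "big") + data)
    return "sha256:" + digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp)
    (pkg / "weights.bin").write_bytes(b"\x00\x01")
    (pkg / "config.json").write_text('{"layers": 2}', encoding="utf-8")
    artifact_id = content_hash(pkg)

    # A human-readable alias points at the immutable identifier.
    aliases = {"champion": artifact_id}

    # Any change to the package yields a different identifier.
    (pkg / "config.json").write_text('{"layers": 3}', encoding="utf-8")
    changed_id = content_hash(pkg)
```

Promotion then means repointing `aliases["champion"]` to a new identifier. The artifact stored under the old identifier is left untouched and remains available as a rollback target.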
Lineage queries that operations should answer
- Which production requests used this model version?
- Which parents and datasets contributed to it?
- Which candidates share the same risky ancestor?
- What changed between the champion and the failing canary?
- Which artifacts must be retired if a license or dataset is revoked?
- Can the exact evaluation and deployment package be reconstructed?
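Several of these questions reduce to graph traversal over `parent_ids`. The following sketch assumes records are held in a dictionary keyed by artifact ID; the sample records, dataset names, and function names are hypothetical.

```python
from collections import deque

# Hypothetical lineage records: artifact_id -> parents and data manifest.
RECORDS = {
    "base": {"parent_ids": [], "training_data_manifest": "ds-1"},
    "a1": {"parent_ids": ["base"], "training_data_manifest": "ds-2"},
    "b1": {"parent_ids": ["base"], "training_data_manifest": "ds-3"},
    "c1": {"parent_ids": ["a1", "b1"], "training_data_manifest": "ds-4"},
    "d1": {"parent_ids": ["b1"], "training_data_manifest": "ds-5"},
}


def ancestors(artifact_id: str, records: dict) -> set[str]:
    """All direct and indirect parents, found by breadth-first traversal."""
    seen = set()
    queue = deque(records[artifact_id]["parent_ids"])
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        queue.extend(records[current]["parent_ids"])
    return seen


def descendants_of(ancestor_id: str, records: dict) -> set[str]:
    """Artifacts that share the given (possibly risky) ancestor."""
    return {aid for aid in records if ancestor_id in ancestors(aid, records)}


def affected_by_dataset(manifest_id: str, records: dict) -> set[str]:
    """Artifacts trained on a dataset, plus everything descended from them."""
    direct = {
        aid for aid, rec in records.items()
        if rec["training_data_manifest"] == manifest_id
    }
    affected = set(direct)
    for aid in direct:
        affected |= descendants_of(aid, records)
    return affected


def diff_records(a: dict, b: dict) -> dict:
    """Fields that differ between two records, as (a_value, b_value) pairs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}
```

For example, `affected_by_dataset("ds-3", RECORDS)` returns the artifact trained on that dataset together with its descendants, which is the retirement set if the dataset is revoked. `diff_records` applied to a champion and a failing canary gives a first answer to what changed between them.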
Archive strategy
Keep prior champions, currently referenced parents, audit evidence, and artifacts required for legal or scientific reproducibility. Retire redundant failed candidates according to policy. Store summaries separately from immutable evidence so searchability does not require rewriting the record.
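A retention rule along these lines might be sketched as follows. The lifecycle state names other than "candidate", the `legal_holds` set, and the "review" fallback are assumptions made for illustration; an actual policy would define its own states and criteria.

```python
def retention_decision(artifact_id: str, records: dict, legal_holds=frozenset()) -> str:
    """Return "keep", "retire", or "review" for one artifact."""
    record = records[artifact_id]
    # Simplification: any artifact listed as a parent counts as referenced.
    referenced_parents = {
        pid for rec in records.values() for pid in rec["parent_ids"]
    }
    if record["lifecycle_state"] in {"champion", "former_champion"}:
        return "keep"      # current and prior champions
    if artifact_id in referenced_parents:
        return "keep"      # still needed to explain a descendant
    if artifact_id in legal_holds:
        return "keep"      # legal or scientific reproducibility
    if record["lifecycle_state"] == "failed":
        return "retire"    # redundant failed candidate
    return "review"        # anything else needs a human decision


ARCHIVE = {
    "p1": {"parent_ids": [], "lifecycle_state": "former_champion"},
    "p2": {"parent_ids": ["p1"], "lifecycle_state": "failed"},
    "p3": {"parent_ids": ["p1"], "lifecycle_state": "failed"},
    "p4": {"parent_ids": ["p2"], "lifecycle_state": "candidate"},
}
decisions = {aid: retention_decision(aid, ARCHIVE) for aid in ARCHIVE}
```

Here `p2` failed but is kept because `p4` still references it as a parent, while `p3` has no dependents and can be retired.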
Source reports used for this guide
These reports are preserved verbatim in the site archive. The guide above is an editorial synthesis and may narrow, qualify, or reorganize claims from the source material.