Foundations · Intermediate · 2 minute read · Updated 2026-06-26 UTC

Lineage and inheritance

The provenance graph required to reproduce, compare, promote, and retire every model descendant.

Research status: Established MLOps adapted to breeding systems · Publication state: Published · Reviewed by: Michael Kappel · Source reports: 2

Lineage is the system memory

A breeding system without lineage is a pile of files. Lineage records how each artifact came to exist, what it inherited, which evidence supported it, and where it was released. The record should be append-only and independent of model-generated descriptions, so that no model can rewrite or embellish its own history.

Figure: Model lineage directed acyclic graph. Nodes: BASE 1.0 (signed parent); ADAPTER CHILD (parent + LoRA); DISTILLED CHILD (teacher coalition); QUANTIZED CHILD (Q4 deployment); CHAMPION 2.0 (promoted after gates).
Every descendant records parentage, operator, data lineage, evaluation evidence, and artifact hashes.

Minimum lineage record

pseudocode
lineage_record <- {
    artifact_id: CONTENT_HASH(all_package_files),
    parent_ids: [parent_a, parent_b],
    operator: "adapter_merge",
    operator_version: "2.1.0",
    operator_config_hash: HASH(config),
    base_family: "example-family-v3",
    tokenizer_id: "tokenizer-sha256",
    training_data_manifest: "dataset-manifest-sha256",
    evaluation_suite_id: "suite-2026-06",
    evaluation_evidence_id: "evidence-sha256",
    created_at_utc: "2026-06-26T00:00:00Z",
    created_by: "controlled-pipeline",
    approvals: ["ml-owner", "security-owner"],
    rollback_target: parent_a,
    lifecycle_state: "candidate"
}
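
The pseudocode above can be sketched in Python as an immutable record. This is a minimal sketch, not a definitive schema: the class name LineageRecord, the helper config_hash, the two validation checks, and every placeholder identifier are assumptions introduced for illustration. Only the field names come from the record above.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Tuple


def config_hash(config: dict) -> str:
    """Hash a config dict via canonical JSON so key order does not matter."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class LineageRecord:
    artifact_id: str
    parent_ids: Tuple[str, ...]
    operator: str
    operator_version: str
    operator_config_hash: str
    base_family: str
    tokenizer_id: str
    training_data_manifest: str
    evaluation_suite_id: str
    evaluation_evidence_id: str
    created_at_utc: str
    created_by: str
    approvals: Tuple[str, ...]
    rollback_target: str
    lifecycle_state: str = "candidate"

    def __post_init__(self) -> None:
        # Illustrative checks only (assumed); a real pipeline would enforce more.
        if self.artifact_id in self.parent_ids:
            raise ValueError("an artifact cannot be its own parent")
        if not self.rollback_target:
            raise ValueError("rollback_target must be assigned")

    def to_json(self) -> str:
        """Serialize deterministically for storage in an append-only log."""
        return json.dumps(asdict(self), sort_keys=True)


# All identifiers below are placeholders, not real digests.
record = LineageRecord(
    artifact_id="sha256:child-placeholder",
    parent_ids=("sha256:parent-a-placeholder", "sha256:parent-b-placeholder"),
    operator="adapter_merge",
    operator_version="2.1.0",
    operator_config_hash=config_hash({"merge_weight": 0.5}),
    base_family="example-family-v3",
    tokenizer_id="tokenizer-sha256",
    training_data_manifest="dataset-manifest-sha256",
    evaluation_suite_id="suite-2026-06",
    evaluation_evidence_id="evidence-sha256",
    created_at_utc="2026-06-26T00:00:00Z",
    created_by="controlled-pipeline",
    approvals=("ml-owner", "security-owner"),
    rollback_target="sha256:parent-a-placeholder",
)
```

Freezing the dataclass fits the append-only requirement: a lifecycle change would be written as a new log entry that references the artifact, rather than as a mutation of the existing record.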

Inherited versus recomputed properties

Do not blindly inherit metadata. Parentage, license obligations, restricted data exposure, and known vulnerabilities propagate unless explicitly resolved. Performance scores, calibration, resource use, and safety evidence must be recomputed for the descendant.

| Property | Inherit? | Required action |
| --- | --- | --- |
| Parent identifiers | Yes | Record all direct parents. |
| License obligations | Yes | Compute the strictest combined obligations. |
| Data restrictions | Yes | Propagate unless a verified unlearning process applies. |
| Benchmark scores | No | Re-run on the descendant. |
| Safety approval | No | Re-evaluate under current policy. |
| Runtime compatibility | Verify | Test exact packaged artifact. |
| Known vulnerabilities | Yes | Mark unresolved until retested. |
| Rollback target | Assign | Choose an artifact proven deployable. |
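
The inheritance rules above can be sketched as a function that builds a descendant's starting metadata from its parents. This is a hedged illustration: ArtifactMetadata, derive_child_metadata, and the example values are hypothetical, and modelling "strictest combined obligations" as the union of the parents' obligations is a simplifying assumption, since real license resolution may need legal review.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional


@dataclass
class ArtifactMetadata:
    artifact_id: str
    parent_ids: List[str] = field(default_factory=list)
    license_obligations: FrozenSet[str] = frozenset()
    data_restrictions: FrozenSet[str] = frozenset()
    unresolved_vulnerabilities: FrozenSet[str] = frozenset()
    benchmark_scores: Optional[Dict[str, float]] = None  # None = not measured
    safety_approved: bool = False
    runtime_verified: bool = False
    rollback_target: Optional[str] = None


def derive_child_metadata(
    child_id: str, parents: List[ArtifactMetadata]
) -> ArtifactMetadata:
    """Propagate what must be inherited; reset what must be recomputed."""
    if not parents:
        raise ValueError("a descendant needs at least one parent")

    def union(attr: str) -> FrozenSet[str]:
        return frozenset().union(*(getattr(p, attr) for p in parents))

    return ArtifactMetadata(
        artifact_id=child_id,
        parent_ids=[p.artifact_id for p in parents],      # Yes: record all
        license_obligations=union("license_obligations"),  # Yes: assumed union
        data_restrictions=union("data_restrictions"),      # Yes: propagate
        unresolved_vulnerabilities=union("unresolved_vulnerabilities"),
        benchmark_scores=None,    # No: re-run on the descendant
        safety_approved=False,    # No: re-evaluate under current policy
        runtime_verified=False,   # Verify: test the exact packaged artifact
        rollback_target=None,     # Assign: must be chosen explicitly
    )


parent_a = ArtifactMetadata(
    "parent-a",
    license_obligations=frozenset({"attribution"}),
    data_restrictions=frozenset({"no-export"}),
    benchmark_scores={"suite-2026-06": 0.9},
    safety_approved=True,
    runtime_verified=True,
)
parent_b = ArtifactMetadata(
    "parent-b",
    license_obligations=frozenset({"share-alike"}),
    unresolved_vulnerabilities=frozenset({"VULN-EXAMPLE-1"}),
)
child = derive_child_metadata("child", [parent_a, parent_b])
```

Note that parent_a's scores and approvals do not reach the child, while both parents' obligations, restrictions, and open vulnerabilities do.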

Content-addressed identifiers

Use a cryptographic digest of the complete package rather than a mutable name such as best-model-final. Human-readable aliases can point to immutable identifiers, but promotions must never overwrite the underlying artifact.
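
One way to derive such an identifier is to hash every file in the package together with its relative path, in a fixed order. The sketch below assumes SHA-256 and a directory-based package layout. The names content_hash and AliasRegistry, the "sha256:" prefix, and the demo files are illustrative choices rather than part of the guide.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, List, Tuple


def content_hash(package_dir: Path) -> str:
    """SHA-256 over each file's relative path and bytes, in sorted order."""
    digest = hashlib.sha256()
    files = sorted(
        (p for p in package_dir.rglob("*") if p.is_file()),
        key=lambda p: p.relative_to(package_dir).as_posix(),
    )
    for path in files:
        rel = path.relative_to(package_dir).as_posix().encode("utf-8")
        # Length-prefix the path and the content so file boundaries are
        # unambiguous in the hashed stream.
        digest.update(len(rel).to_bytes(8, "big") + rel)
        digest.update(path.stat().st_size.to_bytes(8, "big"))
        with path.open("rb") as handle:
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
    return "sha256:" + digest.hexdigest()


class AliasRegistry:
    """Human-readable aliases that point at immutable artifact ids."""

    def __init__(self) -> None:
        self._aliases: Dict[str, str] = {}
        self.history: List[Tuple[str, str]] = []  # append-only audit trail

    def point(self, alias: str, artifact_id: str) -> None:
        if not artifact_id.startswith("sha256:"):
            raise ValueError("aliases may only target content-addressed ids")
        self._aliases[alias] = artifact_id
        self.history.append((alias, artifact_id))

    def resolve(self, alias: str) -> str:
        return self._aliases[alias]


# Demo with a throwaway package; file names and contents are made up.
with tempfile.TemporaryDirectory() as tmp:
    package = Path(tmp) / "package"
    (package / "weights").mkdir(parents=True)
    (package / "config.json").write_text('{"rank": 8}', encoding="utf-8")
    (package / "weights" / "adapter.bin").write_bytes(b"\x00\x01\x02")
    first = content_hash(package)
    second = content_hash(package)  # same bytes, same identifier
    (package / "config.json").write_text('{"rank": 16}', encoding="utf-8")
    changed = content_hash(package)  # any edit yields a new identifier

aliases = AliasRegistry()
aliases.point("champion", first)
aliases.point("champion", changed)  # promotion moves the alias only
```

Promotion here only repoints the alias. The earlier identifier stays valid and remains in the history, so a rollback is another alias move rather than a file restore.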

Lineage queries that operations should answer

  • Which production requests used this model version?
  • Which parents and datasets contributed to it?
  • Which candidates share the same risky ancestor?
  • What changed between the champion and the failing canary?
  • Which artifacts must be retired if a license or dataset is revoked?
  • Can the exact evaluation and deployment package be reconstructed?
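
Several of these questions reduce to traversals of the parent graph. The sketch below is a hedged illustration over an in-memory map: the artifact names echo the figure, but the specific edges, the dataset manifests, and the function names are assumptions made for the example.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical parent map: artifact id -> direct parent ids (edges assumed).
PARENTS: Dict[str, List[str]] = {
    "base-1.0": [],
    "adapter-child": ["base-1.0"],
    "distilled-child": ["base-1.0", "adapter-child"],
    "quantized-child": ["distilled-child"],
    "champion-2.0": ["quantized-child"],
    "unrelated-base": [],
}

# Hypothetical map: artifact id -> dataset manifests used to produce it.
DATASETS: Dict[str, Set[str]] = {
    "base-1.0": {"manifest-a"},
    "adapter-child": {"manifest-b"},
    "distilled-child": set(),
    "quantized-child": set(),
    "champion-2.0": set(),
    "unrelated-base": {"manifest-c"},
}


def ancestors(artifact_id: str, parents: Dict[str, List[str]]) -> Set[str]:
    """All direct and indirect parents, found by breadth-first search."""
    seen: Set[str] = set()
    queue = deque(parents.get(artifact_id, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        queue.extend(parents.get(current, []))
    return seen


def descendants(artifact_id: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Every artifact that has artifact_id somewhere in its ancestry."""
    return {a for a in parents if artifact_id in ancestors(a, parents)}


def affected_by_dataset(
    manifest: str,
    parents: Dict[str, List[str]],
    datasets: Dict[str, Set[str]],
) -> Set[str]:
    """Artifacts that used the manifest directly, plus their descendants."""
    direct = {a for a, used in datasets.items() if manifest in used}
    affected = set(direct)
    for artifact_id in direct:
        affected |= descendants(artifact_id, parents)
    return affected


champion_parents = ancestors("champion-2.0", PARENTS)
share_risky_ancestor = descendants("adapter-child", PARENTS)
retire_if_revoked = affected_by_dataset("manifest-b", PARENTS, DATASETS)
```

Recomputing ancestors for every artifact is quadratic and only suitable for a sketch. A production lineage store would more likely index descendants or run these as graph queries.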

Archive strategy

Keep prior champions, currently referenced parents, audit evidence, and artifacts required for legal or scientific reproducibility. Retire redundant failed candidates according to policy. Store summaries separately from immutable evidence so searchability does not require rewriting the record.
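
The retention rule above can be sketched as a reachability computation: start from the artifacts that must be kept and pull in every parent they reference. The lifecycle state names other than "candidate", the legal_holds parameter, and the choice to follow parents transitively are assumptions for illustration, not policy from the guide.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set, Tuple


@dataclass(frozen=True)
class ArchiveEntry:
    artifact_id: str
    parent_ids: Tuple[str, ...]
    lifecycle_state: str


# Assumed state names; only "candidate" appears in the lineage record.
KEEP_STATES = {"champion", "former_champion", "candidate"}


def artifacts_to_keep(
    entries: Dict[str, ArchiveEntry], legal_holds: Iterable[str]
) -> Set[str]:
    """Kept artifacts plus every parent they reference, transitively."""
    keep = {
        e.artifact_id
        for e in entries.values()
        if e.lifecycle_state in KEEP_STATES
    }
    keep |= set(legal_holds)
    stack = list(keep)
    while stack:
        entry = entries.get(stack.pop())
        if entry is None:
            continue  # id is held but has no archive entry to follow
        for parent in entry.parent_ids:
            if parent not in keep:
                keep.add(parent)
                stack.append(parent)
    return keep


archive = {
    e.artifact_id: e
    for e in [
        ArchiveEntry("base", (), "superseded"),
        ArchiveEntry("champion", ("base",), "champion"),
        ArchiveEntry("failed-1", ("base",), "failed"),
        ArchiveEntry("failed-2", ("base",), "failed"),
    ]
}
kept = artifacts_to_keep(archive, legal_holds=["failed-2"])
retirement_candidates = set(archive) - kept
```

In this example "base" survives because the champion still references it, "failed-2" survives because of the hold, and only "failed-1" becomes eligible for retirement under policy.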

Source reports used for this guide

These reports are preserved verbatim in the site archive. The guide above is an editorial synthesis and may narrow, qualify, or reorganize claims from the source material.