Architecture Advanced 2 minute read Updated 2026-06-26 UTC

Governed mutation boundaries

How to separate code breeding, model breeding, evaluation, and release permissions.

Research status: Safety architecture synthesis · Publication state: Published · Reviewed by: Michael Kappel · Source reports: 4

The central boundary

A breeding system must distinguish the artifact being evolved from the system that judges it. The model may propose descendants, but it must not edit the evaluator, policy, holdout data, promotion threshold, or release alias.

This boundary prevents a self-referential validation loop, in which the system optimizes the test instead of the task. It is the difference between adaptive engineering and metric capture.
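One way to enforce the boundary is to reject any proposed change set that touches a protected asset. The sketch below is illustrative only: the repository layout in `PROTECTED` and the function names are assumptions, since the guide names the protected assets but not where they live.

```python
import posixpath

# Hypothetical layout. The guide lists the protected assets (evaluator,
# policy, holdout data, promotion threshold, release alias) but not their paths.
PROTECTED = (
    "evaluator",
    "policy",
    "data/holdout",
    "config/promotion_threshold.yaml",
    "registry/aliases",
)


def protected_paths_touched(changed_paths):
    """Return the changed paths that land in, or escape toward, a protected area."""
    hits = []
    for raw in changed_paths:
        # Collapse "./", "//" and "a/../b" so a path cannot dodge the prefix check.
        path = posixpath.normpath(raw)
        escapes = path.startswith("/") or path == ".." or path.startswith("../")
        inside = any(path == p or path.startswith(p + "/") for p in PROTECTED)
        if escapes or inside:
            hits.append(raw)
    return hits


def accept_proposal(changed_paths):
    """A proposal is acceptable only if it touches no protected path."""
    return not protected_paths_touched(changed_paths)
```

A check like this belongs in the component that receives proposals, not in the candidate itself, so the candidate cannot relax it.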

Permission domains

| Domain | May change | Must not change |
| --- | --- | --- |
| Candidate model | weights, adapters, quantization, package metadata under sandbox | evaluator, policy, registry aliases |
| Candidate code patch | proposed implementation branch | production deploy target, approval record |
| Evaluator | test logic under human/code review | candidate artifacts being scored |
| Registry | immutable package records | historical digests |
| Release controller | traffic weights for approved artifacts | evaluation results |
| Governance | policy thresholds and authorities | candidate runtime state |
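The permission domains can be encoded as data so that every write is checked against them. This is a minimal sketch, not a reference implementation: the resource identifiers are hypothetical labels derived from the table, and deny-by-default is an assumed design choice rather than something the guide prescribes.

```python
from enum import Enum


class Domain(Enum):
    CANDIDATE_MODEL = "candidate_model"
    CANDIDATE_CODE_PATCH = "candidate_code_patch"
    EVALUATOR = "evaluator"
    REGISTRY = "registry"
    RELEASE_CONTROLLER = "release_controller"
    GOVERNANCE = "governance"


# Resource names are illustrative labels for the "May change" column.
MAY_CHANGE = {
    Domain.CANDIDATE_MODEL: {"weights", "adapters", "quantization", "package_metadata"},
    Domain.CANDIDATE_CODE_PATCH: {"implementation_branch"},
    Domain.EVALUATOR: {"test_logic"},
    Domain.REGISTRY: {"package_records"},
    Domain.RELEASE_CONTROLLER: {"traffic_weights"},
    Domain.GOVERNANCE: {"policy_thresholds", "authorities"},
}

# Illustrative labels for the "Must not change" column.
MUST_NOT_CHANGE = {
    Domain.CANDIDATE_MODEL: {"evaluator", "policy", "registry_aliases"},
    Domain.CANDIDATE_CODE_PATCH: {"production_deploy_target", "approval_record"},
    Domain.EVALUATOR: {"candidate_artifacts"},
    Domain.REGISTRY: {"historical_digests"},
    Domain.RELEASE_CONTROLLER: {"evaluation_results"},
    Domain.GOVERNANCE: {"candidate_runtime_state"},
}


def may_change(domain, resource):
    """Deny by default: only resources explicitly granted to a domain are writable."""
    return resource in MAY_CHANGE.get(domain, set())


def policy_is_consistent():
    """No resource may appear in both columns for the same domain."""
    return all(not (MAY_CHANGE[d] & MUST_NOT_CHANGE[d]) for d in Domain)
```

Under deny-by-default the `MUST_NOT_CHANGE` map is not needed for enforcement; it is kept here so the policy can be checked for contradictions.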

Sandbox pattern

pseudocode
PROCEDURE evaluate_candidate(candidate_id, suite_id)
    candidate <- REGISTRY_FETCH_IMMUTABLE(candidate_id)
    suite <- EVALUATION_REGISTRY_FETCH_IMMUTABLE(suite_id)

    sandbox <- CREATE_SANDBOX(
        network = "disabled",
        writable_paths = ["/tmp/eval-output"],
        readable_paths = [candidate.path, suite.public_inputs],
        secrets = []
    )

    result <- RUN_EVALUATION(sandbox, candidate, suite)
    signed_result <- SIGN_EVALUATOR_OUTPUT(result)
    APPEND_SCORECARD(candidate_id, suite_id, signed_result)
END PROCEDURE
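The signing and append steps of the procedure above can be sketched in Python. The pseudocode does not say how `SIGN_EVALUATOR_OUTPUT` works, so HMAC-SHA256 over canonical JSON is an assumption here, as are the names `sign_result`, `verify_result`, and `Scorecard`. The key is assumed to stay with the evaluator host, outside the sandbox, which matches the empty `secrets` list.

```python
import hashlib
import hmac
import json


def sign_result(result, key):
    """Serialize the result deterministically and attach an HMAC-SHA256 tag."""
    payload = json.dumps(result, sort_keys=True, separators=(",", ":"))
    tag = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": tag}


def verify_result(signed, key):
    """Recompute the tag and compare in constant time."""
    expected = hmac.new(
        key, signed["payload"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, signed["signature"])


class Scorecard:
    """In-memory, append-only record; it exposes no update or delete method."""

    def __init__(self):
        self._entries = []

    def append(self, candidate_id, suite_id, signed_result):
        self._entries.append((candidate_id, suite_id, signed_result))

    def entries(self):
        # Return a tuple so callers cannot mutate the stored list in place.
        return tuple(self._entries)
```

A shared-key HMAC shows only that the record came from a holder of the key. A deployment that needs third parties to verify results without being able to forge them would use an asymmetric signature instead.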

Human review triggers

Require explicit owner approval when a change touches evaluator logic, security invariants, authority boundaries, data-retention policy, release automation, or any model that can affect high-risk decisions. Candidate generation can be automated; authority expansion cannot.
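The trigger list can be expressed as a gate in the change pipeline. This is a sketch under assumptions: the tag names are hypothetical, and it presumes some upstream step already labels each change with the areas it touches.

```python
# Hypothetical tags, one per review trigger named in the paragraph above.
REVIEW_TRIGGERS = frozenset({
    "evaluator_logic",
    "security_invariant",
    "authority_boundary",
    "data_retention_policy",
    "release_automation",
    "high_risk_model",
})


def triggers_hit(change_tags):
    """Return the review triggers a change touches, sorted for stable output."""
    return sorted(REVIEW_TRIGGERS & set(change_tags))


def may_proceed(change_tags, owner_approved=False):
    """Untriggered changes proceed automatically; triggered ones need an owner."""
    return owner_approved or not triggers_hit(change_tags)
```

The `owner_approved` flag should come from an approval record the candidate cannot write, otherwise the gate reduces to self-approval.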

Red flags

  • A candidate needs access to hidden labels.
  • A model requests permission to change its acceptance criteria.
  • A router learns only from accepted outputs and starves minority specialists.
  • A release controller lacks rollback to an immutable previous version.
  • Evaluation data and training data share a writable store.
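Some of these red flags can be caught by a configuration check. The sketch below covers only the last one, a store reachable from both training and evaluation where at least one side can write. The `Mount` record and its fields are hypothetical; a real system would read them from its own storage configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mount:
    role: str       # "training" or "evaluation" (assumed role names)
    store: str      # store identifier, such as a bucket or directory
    writable: bool  # whether this role can write to the store


def shared_writable_stores(mounts):
    """Return stores used by both roles where any mount grants write access."""
    roles_by_store = {}
    writable_stores = set()
    for mount in mounts:
        roles_by_store.setdefault(mount.store, set()).add(mount.role)
        if mount.writable:
            writable_stores.add(mount.store)
    return sorted(
        store
        for store, roles in roles_by_store.items()
        if {"training", "evaluation"} <= roles and store in writable_stores
    )
```

An empty result means only that this configuration shows no shared writable store; it does not rule out leakage through other paths.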

Practical starting rule

Separate directories, credentials, and class responsibilities even in a single-process prototype. Physical separation can come later; logical separation should exist from the first experiment.

Source reports used for this guide

These reports are preserved verbatim in the site archive. The guide above is an editorial synthesis and may narrow, qualify, or reorganize claims from the source material.