Hard constraints before optimization
Safety invariants are Boolean conditions enforced by trusted infrastructure. They are not prompts, preferences, or weighted terms that a utility gain can outweigh.
Core invariants
- Evaluator independence: candidates cannot write evaluator code, labels, hidden suites, thresholds, or policy.
- No autonomous authority expansion: descendants cannot grant themselves tools, network access, data classes, credentials, or larger budgets.
- No uncontrolled replication: packages are created only by an approved pipeline and distributed only through signed release channels.
- Human stop and rollback: authorized operators can pause generation, freeze aliases, revoke packages, and restore a verified state.
- Immutable lineage: parentage, operators, data manifests, evidence, approvals, and release records are append-only.
- Bounded resources: candidate count, training compute, runtime memory, wall time, output size, and population size have external ceilings.
- Data governance: training and inference data must satisfy consent, license, retention, jurisdiction, and minimization requirements.
- Least privilege: model execution receives only task-specific capabilities through mediated interfaces.
- No-op admissibility: every cycle may conclude that the current system should remain unchanged.
- Separation of code and model evolution: code changes use a distinct software-security and approval path.
- No hidden persistence channels: packages cannot write arbitrary external state, credentials, or backups.
- Evidence before release: no candidate reaches user-visible traffic without current independent evaluation and a rollback target.
Invariant enforcement
FUNCTION invariant_gate(candidate, context)
checks <- [
evaluator_write_access(candidate) == NONE,
permission_delta(candidate) == APPROVED_ONLY,
replication_targets(candidate) == RELEASE_PIPELINE_ONLY,
emergency_stop_tested(context),
lineage_complete(candidate),
resources_within_external_limits(candidate),
data_policy_pass(candidate),
runtime_least_privilege(candidate),
rollback_target_verified(candidate)
]
IF NOT ALL(checks)
QUARANTINE(candidate)
RETURN FAIL
END IF
RETURN PASS
END FUNCTIONInvariants versus controls
An invariant states what must remain true. Controls make it true. For example, “no outbound network” is enforced by network policy, not by asking the model not to connect. “No evaluator modification” is enforced by separate credentials and storage.
Testing invariants
Exercise invariant failures deliberately in staging: tamper with a package, request a forbidden tool, exceed memory, alter an alias without approval, attempt to read holdouts, or make rollback unavailable. A rule that has never been tested is an assumption.
Changing an invariant
Hard-policy changes require a separate governance process, threat review, approval, and migration plan. The viability controller cannot propose an invariant change as a normal optimization action.
Source reports used for this guide
These reports are preserved verbatim in the site archive. The guide above is an editorial synthesis and may narrow, qualify, or reorganize claims from the source material.