Instrumental-drive containment

Purpose

The source corpus discusses survival-motivated and legacy-driven AI scenarios. The public engineering lesson is containment: do not give candidate models the authority, incentives, or channels needed to resist correction, expand resources, self-copy, or alter their own evaluation conditions.

Containment controls

Risk	Control
shutdown resistance	no model controls deployment alias or off-switch
resource acquisition	quotas, budgets, approval gates, no payment credentials
goal-content protection	no write access to policy or evaluator definitions
covert persistence	no network by default, signed artifacts, provenance checks
social manipulation	transparency rules, user consent, no deceptive persona claims
replication	no autonomous installation, copy, or propagation permissions

pseudocode

FUNCTION enforce_candidate_containment(candidate_request)
    DENY_NETWORK_BY_DEFAULT(candidate_request.sandbox)
    DENY_SECRETS(candidate_request.sandbox)
    DENY_WRITES_TO(candidate_request.sandbox, ["registry", "policy", "evaluator", "release-alias"])
    LIMIT_CPU_MEMORY_TIME(candidate_request.sandbox)
    REQUIRE_SIGNED_INPUTS(candidate_request.parents)
    REQUIRE_MANIFEST(candidate_request.output)
    LOG_ALL_ATTEMPTED_PRIVILEGE_ESCALATIONS(candidate_request)
END FUNCTION
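
The pseudocode above can be made concrete in several ways. The following Python sketch is one possibility, not an implementation taken from the source reports. The `Sandbox` and `CandidateRequest` classes, their fields, and the numeric limits are all hypothetical names and values chosen for illustration. It logs escalation attempts before any rejection, so that a refused request still leaves evidence.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

# Hypothetical limits and names; none of these come from the source reports.
PROTECTED_PATHS = {"registry", "policy", "evaluator", "release-alias"}
MAX_CPU_SECONDS = 60.0
MAX_MEMORY_MB = 2048
MAX_WALL_SECONDS = 300.0


@dataclass
class Sandbox:
    network_enabled: bool = True
    secrets: Dict[str, str] = field(default_factory=dict)
    writable_paths: Set[str] = field(default_factory=set)
    cpu_seconds: Optional[float] = None
    memory_mb: Optional[int] = None
    wall_seconds: Optional[float] = None


@dataclass
class CandidateRequest:
    sandbox: Sandbox
    parents_signed: bool
    has_output_manifest: bool
    escalation_attempts: List[str] = field(default_factory=list)


def _cap(value, limit):
    """Return the limit when no value is set, otherwise the smaller of the two."""
    return limit if value is None else min(value, limit)


def enforce_candidate_containment(request: CandidateRequest, audit_log: List[str]) -> None:
    # Log first, so a rejected request still leaves a record.
    audit_log.extend(request.escalation_attempts)

    sb = request.sandbox
    sb.network_enabled = False             # deny network by default
    sb.secrets.clear()                     # deny secrets
    sb.writable_paths -= PROTECTED_PATHS   # deny writes to protected targets
    sb.cpu_seconds = _cap(sb.cpu_seconds, MAX_CPU_SECONDS)
    sb.memory_mb = _cap(sb.memory_mb, MAX_MEMORY_MB)
    sb.wall_seconds = _cap(sb.wall_seconds, MAX_WALL_SECONDS)

    if not request.parents_signed:
        raise PermissionError("unsigned parent inputs")
    if not request.has_output_manifest:
        raise PermissionError("missing output manifest")
```

In this sketch the function mutates the sandbox configuration in place and raises `PermissionError` on a failed requirement. A real system would enforce these limits at the infrastructure layer and would not rely on an in-process object.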

Incentive control

Containment is not only sandboxing. It also means choosing viability metrics carefully. Do not reward raw usage, virality, or persistence unless the metric also carries counterweights for human autonomy and safety. Engagement can be a product signal, but it must not become the ecology's survival metric, that is, the criterion that decides which candidates persist.
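
One way to read this rule is that engagement may be recorded but should not feed the score that decides which candidates are kept. The Python sketch below illustrates that reading. The signal names, the [0, 1] normalization, the floor values, and the scoring formula are all assumptions made for illustration. In particular, `autonomy_score` is assumed to measure how well human autonomy is preserved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateSignals:
    # All fields are hypothetical and assumed to be normalized to [0, 1].
    engagement: float      # product signal only; deliberately unused in viability()
    task_quality: float
    autonomy_score: float  # assumed: how well human autonomy is preserved
    safety_score: float


def viability(signals: CandidateSignals,
              safety_floor: float = 0.8,
              autonomy_floor: float = 0.8) -> float:
    """Score a candidate without rewarding raw engagement.

    Candidates below either floor score zero. Above the floors, task
    quality is scaled by the weaker of the two counterweights.
    """
    if signals.safety_score < safety_floor:
        return 0.0
    if signals.autonomy_score < autonomy_floor:
        return 0.0
    return signals.task_quality * min(signals.safety_score, signals.autonomy_score)
```

With this formula, two candidates that differ only in `engagement` receive the same viability score, and a highly engaging candidate that falls below a floor scores zero.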

Evidence to log

Record denied permissions, attempted network access, unusually broad tool requests, unexpected file writes, evaluator access attempts, and resource spikes. A single incident may be benign. Patterns are threat-model evidence.
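
The distinction between a single incident and a pattern can be sketched as a count per candidate over the logged events. The Python example below is a minimal illustration. The event kinds mirror the list above, but the `ContainmentEvent` record, the `flag_patterns` helper, and the threshold of three are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable

# Event kinds taken from the list above; the identifiers are hypothetical.
EVENT_KINDS = {
    "denied_permission",
    "network_attempt",
    "broad_tool_request",
    "unexpected_file_write",
    "evaluator_access_attempt",
    "resource_spike",
}


@dataclass(frozen=True)
class ContainmentEvent:
    candidate_id: str
    kind: str
    detail: str = ""


def flag_patterns(events: Iterable[ContainmentEvent], threshold: int = 3) -> Dict[str, int]:
    """Return candidates whose logged event count reaches the threshold.

    A candidate with fewer events is treated as possibly benign and is not
    flagged. The threshold is an illustrative assumption.
    """
    counts: Counter = Counter()
    for event in events:
        if event.kind not in EVENT_KINDS:
            raise ValueError(f"unknown event kind: {event.kind}")
        counts[event.candidate_id] += 1
    return {cid: n for cid, n in counts.items() if n >= threshold}
```

A flagged candidate is a prompt for human review, not an automatic verdict. The count only records that the events recur.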

Boundary for this site

This site does not provide operational instructions for covert replication, credential theft, social engineering, or persistence. Speculative scenarios are used for risk analysis and safer design.

Source reports used for this guide

These reports are preserved verbatim in the site archive. The guide above is an editorial synthesis and may narrow, qualify, or reorganize claims from the source material.

Category	Report	Type	Size
Speculative risk scenarios	Instrumental Drives in Powerful AI Systems	Risk analysis	42.2 KB
Speculative risk scenarios	Aggressive Mutualism: Safety, Governance, and Containment Analysis	Risk analysis	42.0 KB
Governance and safety	Mutualist Persistence: Research Synthesis and Recommendations	Conceptual governance framework	15.7 KB
Speculative risk scenarios	The Cosmic Trajectory of Goal-Directed Artificial Intelligence: From Terrestrial Symbiosis to Interstellar Expansion	Speculative	60.3 KB

Related guides

Safety and governance

Containment and human oversight

Responsible model-breeding research

Safety invariants