The contract is the primary integration boundary
A capability contract describes what a model can be asked to do, how it reports outcomes, what resources it may consume, and which guarantees callers may rely on. It is stronger than an input/output schema and weaker than an assumption about internal architecture.
Contract fields
capability_contract <- {
  id: "document.classify/2.1",
  input_schema: "schemas/document-classify-input-2.1.json",
  output_schema: "schemas/document-classify-output-2.1.json",
  semantic_definition: "taxonomy-2026-04",
  required_languages: ["en"],
  max_input_bytes: 200000,
  latency_class: "interactive-500ms",
  confidence_semantics: "calibrated_probability",
  error_codes: ["UNSUPPORTED", "LOW_CONFIDENCE", "POLICY_BLOCKED"],
  data_classes_allowed: ["public", "internal"],
  tool_permissions: [],
  deterministic_mode: true,
  required_evaluation_suite: "document-classify-suite-7",
  fallback_behavior: "return_LOW_CONFIDENCE"
}
Schema compatibility
Version contracts using explicit compatibility rules. A model that accepts version 2.1 will not necessarily accept 2.2 if an enum or semantic definition changes. Avoid “best effort” coercion in safety-critical paths, and validate every request against the contract before routing it.
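One way to make the compatibility rule explicit is to route only on an exact, declared version match. The following is a minimal sketch, assuming contract ids follow the `name/major.minor` form seen in the example contract and that each package lists the exact contract ids it has been validated against. The names `ContractId`, `parse_contract_id`, and `is_routable` are hypothetical.

```python
from typing import NamedTuple


class ContractId(NamedTuple):
    name: str
    major: int
    minor: int


def parse_contract_id(raw: str) -> ContractId:
    """Parse an id such as 'document.classify/2.1' (format assumed from the example)."""
    name, _, version = raw.partition("/")
    major_text, _, minor_text = version.partition(".")
    if not name or not major_text.isdigit() or not minor_text.isdigit():
        raise ValueError(f"malformed contract id: {raw!r}")
    return ContractId(name, int(major_text), int(minor_text))


def is_routable(requested: str, accepted: set[str]) -> bool:
    """Route only on an exact, explicitly declared match; never coerce 2.2 to 2.1."""
    wanted = parse_contract_id(requested)  # raises on malformed ids instead of guessing
    return wanted in {parse_contract_id(a) for a in accepted}
```

A looser rule (for example, accepting any higher minor version) could replace the exact match, but it would need to be written down as a rule rather than applied as silent coercion.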
Semantic compatibility
Define labels, units, confidence, abstention, and ordering. If one classifier returns risk=0.7 as a probability and another returns it as a rank score, the router cannot compare them safely. A change to a semantic definition should trigger re-evaluation and often a new major contract version.
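The comparison problem can be illustrated by tagging each score with its declared semantics and refusing to compare across tags. This is only a sketch: the `Score` type, the `higher_risk` helper, and the `"rank_score"` label are assumptions for illustration; `"calibrated_probability"` is the value used in the example contract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Score:
    value: float
    semantics: str  # e.g. "calibrated_probability" or "rank_score" (second label assumed)


def higher_risk(a: Score, b: Score) -> Score:
    """Return the higher of two scores, refusing to compare across semantics."""
    if a.semantics != b.semantics:
        raise ValueError(f"cannot compare {a.semantics!r} with {b.semantics!r}")
    return a if a.value >= b.value else b
```

Raising here, rather than falling back to a numeric comparison, keeps a mismatch visible to the router instead of producing a plausible but meaningless ordering.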
Resource contract
Declare expected and maximum memory, latency, accelerator, energy, network, and output size. The runtime enforces maxima; the registry uses expectations for planning. A candidate that repeatedly exceeds its declared class loses eligibility until re-profiled.
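A sketch of how the two roles might be separated, using latency as the only resource for brevity. The `ResourceProfile` type, the `record_call` and `reprofile` helpers, and the threshold of three violations are assumptions; the text above does not define what counts as "repeatedly".

```python
from dataclasses import dataclass


@dataclass
class ResourceProfile:
    expected_latency_ms: float  # used by the registry for planning
    max_latency_ms: float       # enforced by the runtime
    violations: int = 0
    eligible: bool = True


VIOLATION_LIMIT = 3  # assumed threshold for "repeatedly exceeds"


def record_call(profile: ResourceProfile, observed_latency_ms: float) -> bool:
    """Return True if the call stayed within its maximum; otherwise count a violation."""
    if observed_latency_ms <= profile.max_latency_ms:
        return True
    profile.violations += 1
    if profile.violations >= VIOLATION_LIMIT:
        profile.eligible = False  # stays False until reprofile() is called
    return False


def reprofile(profile: ResourceProfile, expected_ms: float, max_ms: float) -> None:
    """Install fresh measurements and restore eligibility."""
    profile.expected_latency_ms = expected_ms
    profile.max_latency_ms = max_ms
    profile.violations = 0
    profile.eligible = True
```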
Safety and authority contract
A capability contract must state what the model cannot do. Examples: no outbound network, no file writes, no raw personal data, no direct user messaging, or no tool calls without approval. Authority belongs in the runtime policy, not only in the model prompt.
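To illustrate authority living in the runtime rather than the prompt, the sketch below applies a deny-by-default check before any tool call is executed. The field names `tool_permissions` and `data_classes_allowed` come from the example contract; the `RuntimePolicy` type, the `PolicyBlocked` exception, and `authorize_tool_call` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RuntimePolicy:
    tool_permissions: frozenset = field(default_factory=frozenset)
    data_classes_allowed: frozenset = field(default_factory=frozenset)


class PolicyBlocked(Exception):
    """Corresponds to the POLICY_BLOCKED error code in the example contract."""


def authorize_tool_call(policy: RuntimePolicy, tool: str, data_class: str) -> None:
    """Deny by default: raise unless both the tool and the data class are listed."""
    if tool not in policy.tool_permissions:
        raise PolicyBlocked(f"tool not permitted: {tool}")
    if data_class not in policy.data_classes_allowed:
        raise PolicyBlocked(f"data class not permitted: {data_class}")
```

With the example contract's empty `tool_permissions`, every tool call is rejected regardless of what the model output requests.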
Contract test suite
PROCEDURE validate_package_against_contract(package, contract)
  ASSERT MANIFEST_SIGNATURE_VALID(package)
  ASSERT SCHEMA_TESTS_PASS(package, contract)
  ASSERT SEMANTIC_GOLDEN_CASES_PASS(package, contract)
  ASSERT ABSTENTION_BEHAVIOR_PASS(package, contract)
  ASSERT RESOURCE_LIMITS_PASS(package, contract)
  ASSERT PERMISSION_BOUNDARIES_PASS(package, contract)
  ASSERT ERROR_CODES_ARE_STABLE(package, contract)
END PROCEDURE
Capability claims versus evidence
A manifest claim is not proof. The registry should distinguish declared_capabilities from verified_capabilities. The router uses only verified capabilities in production. Verification is tied to a suite version and expires when policy or environment changes materially.
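The distinction might be modeled as follows. The field names `declared_capabilities` and `verified_capabilities` come from the text; the `Verification` record, the `usable_in_production` helper, and the simplification that any change in policy version or environment counts as a material change are assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Verification:
    suite_version: str
    policy_version: str
    environment: str


@dataclass
class RegistryEntry:
    declared_capabilities: set = field(default_factory=set)
    # capability id -> the conditions under which it was verified
    verified_capabilities: dict = field(default_factory=dict)


def usable_in_production(
    entry: RegistryEntry,
    capability: str,
    suite_version: str,
    policy_version: str,
    environment: str,
) -> bool:
    """True only if the capability was verified under the current conditions."""
    verification = entry.verified_capabilities.get(capability)
    if verification is None:
        return False  # a declared-only capability is never routed to
    current = Verification(suite_version, policy_version, environment)
    return verification == current  # any mismatch is treated as expired (assumption)
```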
Contract evolution
Additive optional fields can remain backward compatible. Changed semantics, broader permissions, or new data classes require explicit review. Preserve adapters for legacy clients rather than forcing every model to implement multiple historical contracts internally.
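The review triggers listed above could be checked mechanically when a contract revision is proposed. This is a rough sketch that treats contracts as plain dictionaries keyed like the example contract; the `requires_review` function and its reason strings are hypothetical, and a real check would likely cover more fields.

```python
def requires_review(old: dict, new: dict) -> list[str]:
    """List the reasons a contract change needs explicit review; empty means none found."""
    reasons = []
    if old.get("semantic_definition") != new.get("semantic_definition"):
        reasons.append("semantic_definition changed")
    # Set difference finds entries present in the new contract but not the old one.
    if set(new.get("tool_permissions", [])) - set(old.get("tool_permissions", [])):
        reasons.append("tool_permissions broadened")
    if set(new.get("data_classes_allowed", [])) - set(old.get("data_classes_allowed", [])):
        reasons.append("data_classes_allowed broadened")
    return reasons
```

A field that appears only in the new contract and is not one of the checked keys produces no reason, which matches the additive-optional case; narrowing permissions or data classes also passes without review in this sketch.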
Source reports used for this guide
These reports are preserved verbatim in the site archive. The guide above is an editorial synthesis and may narrow, qualify, or reorganize claims from the source material.