Browser and edge runtime architecture

Why the browser matters

Browser and edge runtimes make the Four Fs (fast, flexible, frugal, federated) concrete. They force frugality because memory is limited. They reward speed because users feel latency immediately. They require flexibility because device capabilities vary. They support federation because private local data can remain local.

A browser model-breeding system should not try to load a large generalist and then improvise. It should treat each skill as a package with a declared footprint, contract, runtime backend, and fallback path.

Runtime layers

Layer	Responsibility	Typical artifact
Loader	fetch, verify, cache, and unload packages	manifest + digest
Runtime	execute model or adapter	WASM, WebGPU, ONNX, GGUF-compatible engine
Router	select the smallest adequate skill	policy + score table
Budget	track memory, latency, and battery cost	local resource ledger
Evaluator	detect confidence failure or unsafe output	local validator or remote review path
Sync	share only approved summaries or updates	federated/distilled records
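
As one illustration of the Router row, the sketch below picks the smallest skill that still clears an adequacy threshold. The score-table shape, the `SkillScore` and `selectSmallestAdequate` names, and the sample entries are assumptions made for this example, not definitions from the source reports.

```typescript
// Hypothetical score-table entry; field names are assumptions for illustration.
interface SkillScore {
  id: string;
  bytes: number; // declared footprint of the package
  score: number; // estimated adequacy for the task, 0..1
}

// Return the smallest skill whose score meets the threshold,
// or null so the caller can take an explicit fallback path.
function selectSmallestAdequate(
  table: readonly SkillScore[],
  minScore: number,
): SkillScore | null {
  const adequate = table.filter((s) => s.score >= minScore);
  if (adequate.length === 0) return null;
  return adequate.reduce((best, s) => (s.bytes < best.bytes ? s : best));
}

const MB = 1e6;
const scoreTable: SkillScore[] = [
  { id: "summarize-large", bytes: 900 * MB, score: 0.92 },
  { id: "summarize-small", bytes: 120 * MB, score: 0.81 },
  { id: "summarize-tiny", bytes: 40 * MB, score: 0.55 },
];

// With a 0.8 threshold the tiny skill is inadequate and the large one is
// adequate but bigger than necessary, so the small skill is selected.
const chosenSkill = selectSmallestAdequate(scoreTable, 0.8);
```

Returning `null` instead of the best available candidate keeps the "no adequate skill" case visible to the caller, which matches the guide's preference for explicit fallbacks.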

Package-first execution

A skill package needs enough metadata for the runtime to say no before loading it. The manifest should include size, required backend, expected latency class, input contract, output contract, risk tier, and unload conditions.

pseudocode

FUNCTION load_skill_if_affordable(skill_manifest, device_state, policy)
    // Manifest-only checks run first so the runtime can refuse before touching weights.
    IF NOT VERIFY_SIGNATURE(skill_manifest)
        RETURN REJECT("invalid signature")

    IF skill_manifest.required_backend NOT_IN device_state.backends
        RETURN REJECT("backend unavailable")

    projected_memory <- device_state.loaded_bytes + skill_manifest.bytes
    IF projected_memory > policy.memory_ceiling
        RETURN REJECT("memory ceiling")

    IF skill_manifest.risk_tier > policy.allowed_risk_tier
        RETURN REJECT("risk tier")

    // Digest verification needs the weight bytes, so it follows the budget checks.
    IF NOT VERIFY_DIGEST(skill_manifest.weights)
        RETURN REJECT("digest mismatch")

    LOAD_WEIGHTS(skill_manifest)
    REGISTER_UNLOAD_RULE(skill_manifest.unload_condition)
    RETURN READY(skill_manifest.id)
END FUNCTION
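
The manifest-only part of that check can be sketched in TypeScript as follows. This is a minimal illustration, not a reference implementation: the type names, field names, and sample values are assumptions that mirror the pseudocode, and signature and digest verification are deliberately left out because they depend on the platform's crypto facilities.

```typescript
// All type and field names are assumptions mirroring the pseudocode.
interface SkillManifest {
  id: string;
  bytes: number;
  requiredBackend: string;
  riskTier: number;
}
interface DeviceState {
  backends: readonly string[];
  loadedBytes: number;
}
interface LoadPolicy {
  memoryCeiling: number;
  allowedRiskTier: number;
}
type Admission =
  | { ok: true; id: string }
  | { ok: false; reason: string };

// Decide whether loading is affordable using manifest metadata only,
// without fetching or touching the weights.
function admitSkill(m: SkillManifest, d: DeviceState, p: LoadPolicy): Admission {
  if (d.backends.indexOf(m.requiredBackend) === -1) {
    return { ok: false, reason: "backend unavailable" };
  }
  if (d.loadedBytes + m.bytes > p.memoryCeiling) {
    return { ok: false, reason: "memory ceiling" };
  }
  if (m.riskTier > p.allowedRiskTier) {
    return { ok: false, reason: "risk tier" };
  }
  return { ok: true, id: m.id };
}

// Hypothetical device with 300 MB already loaded and a 400 MB ceiling.
const sampleDevice: DeviceState = { backends: ["wasm"], loadedBytes: 300e6 };
const samplePolicy: LoadPolicy = { memoryCeiling: 400e6, allowedRiskTier: 1 };
const sampleManifest: SkillManifest = {
  id: "summarize-small",
  bytes: 120e6,
  requiredBackend: "wasm",
  riskTier: 1,
};

// 300 MB + 120 MB exceeds the 400 MB ceiling, so this is rejected.
const admission = admitSkill(sampleManifest, sampleDevice, samplePolicy);
```

Returning a tagged result rather than throwing lets the router treat a rejection as ordinary input to its fallback decision.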

Quantization as a design primitive

Quantization is not merely compression applied after training. It is an architectural primitive because it determines which combinations of skills can coexist on a device. A slightly weaker quantized specialist may be more valuable than a stronger model that prevents the router from loading complementary skills.
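
The coexistence argument comes down to simple arithmetic, sketched below. The estimate used here (parameters times bits per weight, divided by 8) covers weights only and ignores quantization scales, activations, and runtime overhead; the parameter counts and the 1.6 GB ceiling are hypothetical numbers chosen for the example.

```typescript
// Approximate weight footprint in bytes: parameters * bits per weight / 8.
// Assumption: scales, activations, and runtime overhead are ignored.
function weightBytes(params: number, bitsPerWeight: number): number {
  return (params * bitsPerWeight) / 8;
}

// True when every listed footprint fits under the ceiling at the same time.
function fitsTogether(footprints: readonly number[], ceilingBytes: number): boolean {
  const total = footprints.reduce((sum, b) => sum + b, 0);
  return total <= ceilingBytes;
}

const GB = 1e9;
const deviceCeiling = 1.6 * GB; // hypothetical device budget

const strongFp16 = weightBytes(1e9, 16);   // 2.0 GB for a 1B-parameter model
const specialistInt4 = weightBytes(1e9, 4); // 0.5 GB for the same size at 4 bits

// The 16-bit model does not fit even on its own...
const strongAloneFits = fitsTogether([strongFp16], deviceCeiling);
// ...while three 4-bit specialists fit side by side.
const threeSpecialistsFit = fitsTogether(
  [specialistInt4, specialistInt4, specialistInt4],
  deviceCeiling,
);
```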

Privacy boundary

Local execution does not automatically make a system private. Telemetry, cache keys, prompts, embeddings, and sync payloads can leak. Treat every outbound event as a data product that needs minimization, purpose limitation, and user-visible policy.
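
One way to apply minimization and purpose limitation is an allow-list per purpose, so that fields are dropped unless explicitly permitted. The sketch below assumes a hypothetical pair of purposes and field names; none of them come from the source reports.

```typescript
// Hypothetical outbound-event policy: each purpose lists the only fields
// allowed to leave the device. All names are illustrative assumptions.
type Purpose = "crash_report" | "usage_metric";

const ALLOWED_FIELDS: Record<Purpose, readonly string[]> = {
  crash_report: ["skillId", "errorCode"],
  usage_metric: ["skillId", "latencyClass"],
};

// Copy only allow-listed fields. Anything else (prompts, embeddings,
// cache keys) is dropped by default instead of being filtered by name.
function minimizeEvent(
  purpose: Purpose,
  payload: Record<string, unknown>,
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  ALLOWED_FIELDS[purpose].forEach((key) => {
    if (Object.prototype.hasOwnProperty.call(payload, key)) {
      out[key] = payload[key];
    }
  });
  return out;
}

// The prompt field is not on the usage_metric allow-list, so it never leaves.
const sentPayload = minimizeEvent("usage_metric", {
  skillId: "summarize-small",
  latencyClass: "interactive",
  prompt: "private user text",
});
```

A deny-list would fail open when a new sensitive field is added; the allow-list fails closed, which is the safer default for outbound data.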

Failure modes

The main edge failures are memory spikes, cold-start latency, stale cached packages, mismatched tokenizers, unsupported operators, and silent evaluator bypass when the device is offline. The safe default is an explicit fallback, not an ungoverned best effort.
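
The silent-bypass failure can be avoided by treating "evaluator did not run" as its own outcome rather than as a pass. The following is a minimal sketch under that assumption; the `Verdict` and `Outcome` types and the `gateOutput` function are hypothetical names introduced for illustration.

```typescript
// Hypothetical outcome type: the caller always learns which path was taken.
type Outcome =
  | { kind: "answer"; text: string }
  | { kind: "fallback"; reason: string };

// An evaluator verdict. null models "the evaluator could not run",
// for example a remote review path while the device is offline.
type Verdict = "pass" | "fail" | null;

// Release output only on an explicit pass. A missing verdict is treated
// as a failure to evaluate, never as permission.
function gateOutput(text: string, verdict: Verdict): Outcome {
  if (verdict === "pass") {
    return { kind: "answer", text: text };
  }
  if (verdict === "fail") {
    return { kind: "fallback", reason: "evaluator rejected output" };
  }
  return { kind: "fallback", reason: "evaluator unavailable" };
}

// Offline case: the draft is withheld and the reason is recorded.
const offlineResult = gateOutput("draft reply", null);
```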

Source reports used for this guide

These reports are preserved verbatim in the site archive. The guide above is an editorial synthesis and may narrow, qualify, or reorganize claims from the source material.

The Teleodynamic Convergence: The 4Fs of AI, Code Beading, and the Evolution of Mutable Small Models
Architecture and edge systems · Mixed maturity · 46.7 KB

The 4Fs Framework: Fast, Flexible, Frugal, Federated
Core synthesis · Emerging practice · 22.5 KB

The Architecture of Adaptability: An Exhaustive Analysis of the 4Fs, Code Beading, Model Breeding, and Interchangeable Systems
Core synthesis · Mixed maturity · 49.7 KB

Related guides

Edge, cloud, and federated deployment

Architecture

Reference architecture

Skill package manifests