Routing is a policy decision
The router chooses which model or coalition receives a task. It affects quality, cost, data exposure, and which specialists receive evidence. Treat it as a governed component with its own lineage and evaluation, not as an invisible convenience layer.
Routing strategies
| Strategy | Strength | Limitation | Good use |
|---|---|---|---|
| Static rules | Predictable and auditable | Brittle as tasks change | Clear domains and compliance boundaries |
| Capability lookup | Simple modularity | Depends on accurate metadata | Contract-driven systems |
| Learned classifier | Adapts to complex inputs | Can drift or bias traffic | High-volume stable taxonomies |
| Cost-aware optimizer | Explicit budget trade-offs | Requires reliable profiles | Multi-tier edge/cloud systems |
| Cascade | Saves average cost | High worst-case latency | Easy versus hard query separation |
| Parallel coalition | Improves robustness | Expensive; member errors can be correlated | High-value uncertain tasks |
| Champion–challenger split | Produces comparison evidence | Must protect users from challengers | Shadow and canary evaluation |
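As one illustration of the Cascade row above, the following is a minimal sketch rather than a reference implementation. It assumes each tier returns an answer together with a self-reported confidence in [0, 1]; the names `Tier`, `CascadeResult`, `run_cascade`, and the 0.8 threshold are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Tier:
    name: str
    cost: float
    # Stand-in for a real model call: returns (answer, confidence in [0, 1]).
    predict: Callable[[str], Tuple[str, float]]


@dataclass
class CascadeResult:
    answer: Optional[str]
    accepted: bool
    cost: float
    trace: List[str]


def run_cascade(query: str, tiers: List[Tier], threshold: float = 0.8) -> CascadeResult:
    """Try tiers in the given order (cheapest first) and stop at the first confident answer."""
    spent = 0.0
    trace: List[str] = []
    for tier in tiers:
        answer, confidence = tier.predict(query)
        spent += tier.cost          # every attempted tier is paid for
        trace.append(tier.name)
        if confidence >= threshold:
            return CascadeResult(answer, True, spent, trace)
    # No tier was confident enough: the caller decides on fallback or abstention.
    return CascadeResult(None, False, spent, trace)


# Toy tiers: the small model is only confident on short queries.
small = Tier("small", 1.0, lambda q: ("s:" + q, 0.9 if len(q) < 10 else 0.3))
large = Tier("large", 10.0, lambda q: ("l:" + q, 0.95))
easy = run_cascade("hi", [small, large])
hard = run_cascade("a much longer query", [small, large])
```

In this toy setup the easy query stops at the small tier, while the hard query pays for both tiers in sequence, which is the worst-case latency and cost the table warns about.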
Eligibility before ranking
First filter by hard predicates: capability contract, data jurisdiction, risk tier, runtime compatibility, permissions, current health, and resource ceiling. Ranking only happens among eligible models.
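The filter step can also record why each model was excluded, which helps treat routing as an auditable component. The sketch below is illustrative only: the `Model` and `Request` fields, the predicate names, and `filter_eligible` are assumptions made for this example, and it covers only a subset of the predicates listed above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Model:
    name: str
    capabilities: frozenset
    regions: frozenset        # jurisdictions where the model may process data
    max_risk_tier: int        # highest risk tier the model is approved for
    healthy: bool = True


@dataclass(frozen=True)
class Request:
    capability: str
    region: str
    risk_tier: int


Predicate = Callable[[Model, Request], bool]

# Hard predicates are pass/fail; none of them is a score (hypothetical names).
HARD_PREDICATES: List[Tuple[str, Predicate]] = [
    ("capability_contract", lambda m, r: r.capability in m.capabilities),
    ("data_jurisdiction", lambda m, r: r.region in m.regions),
    ("risk_tier", lambda m, r: r.risk_tier <= m.max_risk_tier),
    ("health", lambda m, r: m.healthy),
]


def filter_eligible(models: List[Model], request: Request):
    """Return (eligible models, {model name: failed predicate names})."""
    eligible: List[Model] = []
    rejections: Dict[str, List[str]] = {}
    for model in models:
        failed = [name for name, check in HARD_PREDICATES if not check(model, request)]
        if failed:
            rejections[model.name] = failed
        else:
            eligible.append(model)
    return eligible, rejections


models = [
    Model("a", frozenset({"summarize"}), frozenset({"eu"}), 2),
    Model("b", frozenset({"summarize"}), frozenset({"us"}), 3),
    Model("c", frozenset({"translate"}), frozenset({"eu"}), 1, healthy=False),
]
eligible, rejections = filter_eligible(models, Request("summarize", "eu", 2))
```

Keeping the rejection reasons alongside the eligible set means a later ranking step never sees an ineligible model, and the exclusions can be logged with the routing decision.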
FUNCTION route(request, population, ledger)
    eligible <- []
    FOR each model IN population
        IF contract_matches(model, request)
           AND policy_allows(model, request)
           AND resource_profile_fits(model, ledger)
           AND health_is_acceptable(model)
            APPEND eligible, model
        END IF
    END FOR
    IF eligible IS EMPTY
        RETURN approved_fallback_plan(request)
    END IF
    ranked <- SCORE_BY_EXPECTED_VALUE(eligible, request, ledger)
    RETURN BUILD_BOUNDED_PLAN(ranked, request.risk_tier)
END FUNCTION

Coalition rules
Cap coalition size. Prefer independent generation followed by a fixed aggregator or judge. Define timeouts, quorum, disagreement behavior, and maximum total cost. Do not let models recruit additional models dynamically unless the planner itself is governed and budgeted.
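The rules above can be sketched as a small bounded runner. This is a simplified, synchronous illustration under stated assumptions: a member is a hypothetical `(name, cost, call)` tuple, a call that returns `None` stands in for a timeout, and the fixed aggregator is a plain majority vote. `CoalitionPolicy` and `run_coalition` are names invented for this example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# (name, cost, call); call returns None to model a timeout (assumption).
Member = Tuple[str, float, Callable[[str], Optional[str]]]


@dataclass(frozen=True)
class CoalitionPolicy:
    max_members: int = 3
    quorum: int = 2              # minimum number of answers that must arrive
    min_agreement: int = 2       # votes the winning answer needs
    max_total_cost: float = 10.0


def run_coalition(task: str, members: List[Member], policy: CoalitionPolicy) -> Dict:
    # Select members before calling anything, so size and cost are bounded up front.
    selected = []
    planned_cost = 0.0
    for name, cost, call in members:
        if len(selected) == policy.max_members:
            break
        if planned_cost + cost > policy.max_total_cost:
            continue
        selected.append((name, call))
        planned_cost += cost

    # Independent generation: no member sees another member's output.
    answers = [a for a in (call(task) for _, call in selected) if a is not None]
    if len(answers) < policy.quorum:
        return {"status": "no_quorum", "answer": None, "cost": planned_cost}

    # Fixed aggregator: majority vote with an explicit disagreement outcome.
    winner, votes = Counter(answers).most_common(1)[0]
    if votes < policy.min_agreement:
        return {"status": "disagreement", "answer": None, "cost": planned_cost}
    return {"status": "ok", "answer": winner, "cost": planned_cost}


members = [
    ("m1", 3.0, lambda t: "yes"),
    ("m2", 3.0, lambda t: "yes"),
    ("m3", 3.0, lambda t: "no"),
    ("m4", 3.0, lambda t: "no"),   # never called: the size cap is reached first
]
result = run_coalition("q", members, CoalitionPolicy())
```

Because the member list is fixed before any call is made, no member can recruit another model at run time; a production version would also need real timeouts and concurrent calls, which this sketch omits.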
Exploration without user harm
Use shadow routing to collect challenger predictions without affecting responses. For canaries, restrict traffic by risk tier, cohort, geography, or request type. Exploration quotas should be explicit and reversible.
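One possible way to make the quota explicit and reversible is a deterministic hash bucket behind a single switch. The sketch below assumes a string request identifier and hypothetical names (`CanaryPolicy`, `choose_mode`, `handle`); the quota is expressed in basis points purely for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class CanaryPolicy:
    enabled: bool            # one switch that reverses the rollout
    quota_bps: int           # exploration quota in basis points (100 = 1%)
    max_risk_tier: int       # canary only at or below this risk tier
    cohorts: frozenset       # cohorts allowed to see challenger output
    salt: str = "canary-v1"  # change the salt to reshuffle assignments


def bucket_bps(request_id: str, salt: str) -> int:
    """Deterministic bucket in [0, 9999] derived from the request id."""
    digest = hashlib.sha256(f"{salt}:{request_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 10_000


def choose_mode(request_id: str, risk_tier: int, cohort: str, policy: CanaryPolicy) -> str:
    """Return 'canary' if the challenger may serve this request, else 'shadow'."""
    in_scope = (
        policy.enabled
        and risk_tier <= policy.max_risk_tier
        and cohort in policy.cohorts
    )
    if in_scope and bucket_bps(request_id, policy.salt) < policy.quota_bps:
        return "canary"
    return "shadow"


def handle(text: str, mode: str, champion: Callable[[str], str],
           challenger: Callable[[str], str], shadow_log: List[str]) -> str:
    if mode == "canary":
        return challenger(text)
    # Shadow: the challenger's output is recorded but never returned to the user.
    shadow_log.append(challenger(text))
    return champion(text)
```

Setting `enabled` to `False` or `quota_bps` to `0` sends every request back to shadow mode without changing any other routing state.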
Router feedback loops
A router can starve a specialist of traffic, then conclude it lacks evidence or quality. It can also create popularity loops where already-selected models receive more updates and become increasingly dominant. Preserve evaluation traffic, use counterfactual or offline datasets, and measure traffic concentration.
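A simple way to preserve evaluation traffic is to guarantee every model a minimum share before the router's preferences are applied. The following is a sketch of that idea only, assuming non-negative preference weights; `apply_traffic_floor` and `max_share` are hypothetical helpers, not part of any particular router.

```python
from typing import Dict


def apply_traffic_floor(shares: Dict[str, float], floor: float) -> Dict[str, float]:
    """Give each model at least `floor` of traffic, then split the rest by preference.

    `shares` holds the router's preferred, non-negative weights (assumption).
    """
    n = len(shares)
    if n == 0:
        return {}
    if floor < 0 or floor * n > 1.0:
        raise ValueError("floor must be non-negative and floor * models must not exceed 1")
    total = sum(shares.values())
    if total <= 0:
        # No preference signal at all: fall back to a uniform split.
        return {name: 1.0 / n for name in shares}
    remaining = 1.0 - floor * n
    return {name: floor + remaining * (s / total) for name, s in shares.items()}


def max_share(shares: Dict[str, float]) -> float:
    """Crude concentration measure: the largest single traffic share."""
    return max(shares.values(), default=0.0)


preferred = {"a": 0.97, "b": 0.03, "c": 0.0}
adjusted = apply_traffic_floor(preferred, floor=0.05)
```

With this floor, a model the router would otherwise starve still receives a small, known share, so its quality estimate keeps being refreshed while the dominant model's share is capped below 100%.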
Router metrics
Track routing accuracy, fallback rate, abstention rate, cost per accepted result, p95/p99 latency, specialist utilization, traffic entropy, disagreement rate, and quality by route. Evaluate the router and model population jointly because a strong router can hide weak models and vice versa.
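Two of these metrics are easy to misdefine, so a small sketch may help. It assumes traffic entropy means Shannon entropy over route counts, normalized to [0, 1], and that cost per accepted result charges the cost of rejected attempts to the accepted ones. Both definitions, and the record fields `cost` and `accepted`, are assumptions made for this example.

```python
import math
from typing import Dict, List


def traffic_entropy(route_counts: Dict[str, int]) -> float:
    """Normalized Shannon entropy: 0.0 = all traffic on one route, 1.0 = uniform."""
    total = sum(route_counts.values())
    k = len(route_counts)
    if total == 0 or k < 2:
        return 0.0
    h = sum(-(c / total) * math.log(c / total) for c in route_counts.values() if c > 0)
    return h / math.log(k)


def cost_per_accepted(records: List[dict]) -> float:
    """Total spend divided by accepted results; rejected attempts still count as spend."""
    total_cost = sum(r["cost"] for r in records)
    accepted = sum(1 for r in records if r["accepted"])
    return total_cost / accepted if accepted else math.inf


records = [
    {"route": "small", "cost": 1.0, "accepted": True},
    {"route": "small", "cost": 2.0, "accepted": False},
    {"route": "large", "cost": 3.0, "accepted": True},
]
balanced = traffic_entropy({"small": 50, "large": 50})
concentrated = traffic_entropy({"small": 100, "large": 0})
unit_cost = cost_per_accepted(records)
```

A falling normalized entropy over time is one signal of the popularity loops described earlier, and it is worth reading alongside quality by route rather than on its own.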
Source reports used for this guide
These reports are preserved verbatim in the site archive. The guide above is an editorial synthesis and may narrow, qualify, or reorganize claims from the source material.