Theory Advanced 2 minute read Updated 2026-06-26 UTC

Ecological fitness

How to measure a candidate by its marginal value to the population rather than its isolated benchmark rank.

Research status: Conceptual synthesis | Publication state: Published | Reviewed by: Michael Kappel | Source reports: 3

Individual fitness versus ecological fitness

A model can score well in isolation and still make the system worse. It may duplicate an existing specialist, correlate with existing failures, add tail latency, or confuse the router. Conversely, a slightly weaker candidate can be valuable if it covers a rare niche, fails differently from the champion, or provides a low-cost fallback.

Ecological fitness is marginal contribution under current constraints. It asks what changes when the candidate joins the population.

Four contribution types

| Contribution | Question | Measurement hint |
| --- | --- | --- |
| Coverage | What tasks become newly solvable or better served? | Slice-level gains and task inventory changes |
| Complementarity | Where does this candidate succeed when others fail? | Error decorrelation and disagreement audits |
| Efficiency | Does it reduce average or tail cost? | Router savings, early exits, memory residency |
| Resilience | Does it improve recovery after failure or drift? | Population removal tests and incident simulations |

Marginal contribution estimate

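A full Shapley-style contribution estimate is expensive because it requires evaluating the candidate against every possible coalition, and the number of coalitions grows exponentially with population size. For production decisions, sample a budgeted number of coalitions and average the candidate's marginal gain across them.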
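For reference, the quantity being approximated can be written out. The notation here is an assumption introduced for illustration, not taken from the source reports: $P$ is the current population, $c$ the candidate, $v(S)$ the viability of a coalition $S$, and $n = |P| + 1$. This is the standard Shapley value of $c$ in the population $P \cup \{c\}$.

```latex
\Delta(c \mid S) = v(S \cup \{c\}) - v(S)

\phi_c = \sum_{S \subseteq P} \frac{|S|!\,(n - |S| - 1)!}{n!}\,\Delta(c \mid S)
```

The sum runs over all $2^{|P|}$ subsets of the population, which is why an exact computation quickly becomes impractical and a sampled average is used instead.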

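pseudocode
FUNCTION approximate_marginal_value(candidate, population, test_slices, budget)
    // Sample at most `budget` coalitions (subsets) of the current population.
    sampled_coalitions <- SAMPLE_COALITIONS(population, budget)
    IF COUNT(sampled_coalitions) = 0
        RETURN UNDEFINED    // no coalitions sampled, so no estimate
    END IF

    score <- 0
    FOR coalition IN sampled_coalitions
        without <- EVALUATE(coalition, test_slices)
        with_candidate <- EVALUATE(coalition + candidate, test_slices)
        score <- score + (with_candidate.viability - without.viability)
    END FOR

    // Mean viability gain from adding the candidate.
    RETURN score / COUNT(sampled_coalitions)
END FUNCTION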
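The pseudocode above can be made runnable as a minimal Python sketch. Everything concrete here is an assumption for illustration: the `evaluate` callback stands in for EVALUATE and returns a single viability number, the toy `COVERAGE` table and `toy_viability` function are hypothetical, and coalitions are sampled by including each member with probability 0.5. That sampling scheme weights all subsets equally, which gives a Banzhaf-style average rather than exact Shapley weighting.

```python
import random
from typing import Callable, FrozenSet, List, Sequence

# Hypothetical evaluator type: coalition of module names -> viability score.
Evaluate = Callable[[FrozenSet[str]], float]


def sample_coalitions(population: Sequence[str], budget: int,
                      rng: random.Random) -> List[FrozenSet[str]]:
    """Draw `budget` random subsets; each member is included with p = 0.5."""
    return [frozenset(m for m in population if rng.random() < 0.5)
            for _ in range(budget)]


def approximate_marginal_value(candidate: str, population: Sequence[str],
                               evaluate: Evaluate, budget: int,
                               seed: int = 0) -> float:
    """Mean viability gain from adding `candidate` to sampled coalitions."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    rng = random.Random(seed)
    coalitions = sample_coalitions(population, budget, rng)
    total = 0.0
    for coalition in coalitions:
        total += evaluate(coalition | {candidate}) - evaluate(coalition)
    return total / len(coalitions)


# Toy data (assumed): which test slices each module can serve.
COVERAGE = {
    "champion": {"a", "b", "c"},
    "clone": {"a", "b", "c"},   # duplicates the champion
    "niche": {"d"},             # weak overall, but covers a rare slice
}
ALL_SLICES = {"a", "b", "c", "d"}


def toy_viability(coalition: FrozenSet[str]) -> float:
    """Fraction of slices served by at least one coalition member."""
    covered = set().union(*(COVERAGE[m] for m in coalition))
    return len(covered) / len(ALL_SLICES)


niche_value = approximate_marginal_value(
    "niche", ["champion", "clone"], toy_viability, budget=32)
clone_value = approximate_marginal_value(
    "clone", ["champion", "niche"], toy_viability, budget=32)
print(niche_value, clone_value)
```

In this toy setup the niche module adds the same gain to every coalition because nothing else covers slice `d`, while the clone only helps in coalitions where the champion happens to be absent.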

Error correlation matters

Two models with different names, weights, and benchmark cards may still fail on the same examples because they share data, architecture, training recipes, or synthetic teachers. A robust ecology needs behavioral diversity, not cosmetic diversity.

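pseudocode
FUNCTION error_correlation(model_a, model_b, test_cases)
    // Record, per test case, whether each model failed (1) or passed (0).
    paired <- []
    FOR case IN test_cases
        paired.APPEND([FAILED(model_a, case), FAILED(model_b, case)])
    END FOR
    // Correlation of the two failure columns; undefined if either model
    // fails on every case or on none.
    RETURN CORRELATION(paired)
END FUNCTION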

High failure correlation reduces ecological value, because the candidate breaks on the same cases the population already gets wrong. A candidate that fails differently can be useful even if its average score is lower.
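One way to make the correlation step concrete is the phi coefficient, which is the Pearson correlation of two binary indicators. The sketch below assumes the failure flags have already been collected as two equal-length boolean sequences; that input shape is an assumption, not something the source reports specify.

```python
import math
from typing import Sequence


def error_correlation(failed_a: Sequence[bool],
                      failed_b: Sequence[bool]) -> float:
    """Phi coefficient between two per-case failure indicators.

    Returns NaN when either model fails on every case or on none,
    because the correlation is undefined for a constant column.
    """
    if len(failed_a) != len(failed_b):
        raise ValueError("failure vectors must have the same length")
    n = len(failed_a)
    if n == 0:
        raise ValueError("at least one test case is required")

    fails_a = sum(1 for a in failed_a if a)
    fails_b = sum(1 for b in failed_b if b)
    both = sum(1 for a, b in zip(failed_a, failed_b) if a and b)

    numerator = n * both - fails_a * fails_b
    denominator = math.sqrt(
        fails_a * (n - fails_a) * fails_b * (n - fails_b))
    if denominator == 0:
        return float("nan")
    return numerator / denominator


# Two models that fail on exactly the same cases are fully correlated.
same = error_correlation([True, False, True, False],
                         [True, False, True, False])
# Two models that never fail together are fully anti-correlated here.
opposite = error_correlation([True, False, True, False],
                             [False, True, False, True])
print(same, opposite)
```

A value near 1 suggests the pair offers little complementarity, while values near 0 or below indicate the models tend to fail on different cases.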

Fitness is route-dependent

A model's value depends on whether the router can find it at the right time. If a specialist is never selected, it consumes memory without producing value. If it is selected too often, it can starve better modules. Evaluate the candidate and the routing policy together.
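A joint evaluation of candidate and router could start from a routing audit like the sketch below. It assumes an offline replay is available in which every module's outcome is known for each case, including modules the router did not pick. The `RoutedCase` record and the three reported metrics are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class RoutedCase:
    selected: str               # module the router chose for this case
    outcomes: Dict[str, bool]   # assumed offline replay: module -> succeeded?


def routing_audit(cases: Sequence[RoutedCase], module: str) -> Dict[str, float]:
    """Summarize how the router uses `module` across replayed cases."""
    if not cases:
        raise ValueError("at least one routed case is required")

    chosen = [c for c in cases if c.selected == module]

    # Under-selection: the router picked another module that failed,
    # although `module` would have succeeded.
    missed = sum(
        1 for c in cases
        if c.selected != module
        and not c.outcomes[c.selected]
        and c.outcomes.get(module, False)
    )

    # Over-selection: `module` was picked and failed, although some
    # other module would have succeeded.
    starving = sum(
        1 for c in chosen
        if not c.outcomes[module]
        and any(ok for name, ok in c.outcomes.items() if name != module)
    )

    return {
        "selection_rate": len(chosen) / len(cases),
        "missed_opportunities": missed,
        "starved_alternatives": starving,
    }


cases = [
    RoutedCase("general", {"general": True, "specialist": True}),
    RoutedCase("general", {"general": False, "specialist": True}),
    RoutedCase("specialist", {"general": True, "specialist": False}),
    RoutedCase("specialist", {"general": False, "specialist": True}),
]
audit = routing_audit(cases, "specialist")
print(audit)
```

A selection rate near zero with many missed opportunities points at the router rather than the specialist, while a high count of starved alternatives suggests the specialist is being selected too often.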

Retirement can improve fitness

Population fitness can increase when a module is deleted. Deletion reduces cognitive and operational burden, narrows the security surface, lowers memory pressure, and improves router clarity. A breeding system must therefore treat retirement as a positive structural operator, not as an admission of failure.
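A population removal test is one way to find retirement candidates. The sketch below is a minimal leave-one-out check under assumed names: `fitness` is a hypothetical callback that scores a whole population, and the toy fitness function charges a fixed overhead per module to stand in for memory and operational cost.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

# Hypothetical population-level scorer: set of module names -> fitness.
Fitness = Callable[[FrozenSet[str]], float]


def retirement_candidates(population: Sequence[str], fitness: Fitness,
                          tolerance: float = 0.0) -> List[Tuple[str, float]]:
    """Return (module, fitness change) for modules whose removal does not
    lower population fitness by more than `tolerance`."""
    full = frozenset(population)
    baseline = fitness(full)
    candidates = []
    for module in population:
        without = fitness(full - {module})
        if without >= baseline - tolerance:
            candidates.append((module, without - baseline))
    return candidates


# Toy data (assumed): slice coverage per module and a per-module overhead.
COVERAGE = {
    "champion": {"a", "b", "c"},
    "clone": {"a", "b", "c"},
    "niche": {"d"},
}
ALL_SLICES = {"a", "b", "c", "d"}
OVERHEAD_PER_MODULE = 0.05


def toy_fitness(members: FrozenSet[str]) -> float:
    """Coverage fraction minus a fixed cost for every resident module."""
    covered = set().union(*(COVERAGE[m] for m in members))
    return len(covered) / len(ALL_SLICES) - OVERHEAD_PER_MODULE * len(members)


result = retirement_candidates(["champion", "clone", "niche"], toy_fitness)
print(result)
```

Each check removes a single module, so two flagged modules are not necessarily safe to retire together. In the toy data the champion and its clone are each redundant only while the other remains, which is why it makes sense to retire one module at a time and re-run the test.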

Source reports used for this guide

These reports are preserved verbatim in the site archive. The guide above is an editorial synthesis and may narrow, qualify, or reorganize claims from the source material.