Cooperative caste ecosystems

A lab design for proposer, solver, judge, router, and critic populations that co-improve without letting candidates control their own evaluation.

Research status: Multi-agent pattern adapted from source directive and teleodynamic architecture
Publication state: Published
Reviewed by: Michael Kappel
Source reports: 3

Direct answer

A cooperative caste ecosystem splits a model-breeding lab into role populations. Proposers create tasks, solvers attempt solutions, judges grade evidence, routers allocate work, and critics identify blind spots. The point is not free-form agent debate but a structured division of labor, with independent evaluation and a shared benefit target.

Role map

Caste	Function	Fitness evidence
Proposer	Generates hard but useful tasks and test cases	Task novelty, validity, coverage, reuse
Solver	Produces candidate answers, patches, summaries, or predictions	Accuracy, utility, calibration, cost
Judge	Evaluates outputs against expected evidence	Agreement with held-out truth and human review
Router	Chooses the smallest capable path	Correct routing, latency, escalation quality
Critic	Finds failure modes and missing assumptions	Prevented regressions and improved tests
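
As a rough illustration, the role map can be held as a small lookup table that a lab harness consults when scoring a caste member. This is a sketch under assumptions: the identifier names are hypothetical, every signal is assumed to be pre-normalized to the range 0 to 1 with higher meaning better (so cost and latency would be inverted upstream), and the unweighted mean is a placeholder for whatever weighting a real lab would choose.

```python
from enum import Enum


class Caste(Enum):
    PROPOSER = "proposer"
    SOLVER = "solver"
    JUDGE = "judge"
    ROUTER = "router"
    CRITIC = "critic"


# Fitness evidence per caste, taken from the role map above.
# The snake_case key names are hypothetical.
FITNESS_EVIDENCE = {
    Caste.PROPOSER: ("task_novelty", "validity", "coverage", "reuse"),
    Caste.SOLVER: ("accuracy", "utility", "calibration", "cost"),
    Caste.JUDGE: ("agreement_with_held_out_truth", "agreement_with_human_review"),
    Caste.ROUTER: ("correct_routing", "latency", "escalation_quality"),
    Caste.CRITIC: ("prevented_regressions", "improved_tests"),
}


def fitness(caste: Caste, evidence: dict[str, float]) -> float:
    """Average a caste's evidence signals (assumed in [0, 1], higher is better)."""
    keys = FITNESS_EVIDENCE[caste]
    missing = [k for k in keys if k not in evidence]
    if missing:
        # Refuse to score on partial evidence rather than silently defaulting.
        raise KeyError(f"missing evidence for {caste.value}: {missing}")
    return sum(evidence[k] for k in keys) / len(keys)
```

Raising on missing evidence, rather than treating it as zero, keeps a member from being scored on a signal that was never measured.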

Cooperative loop

pseudocode

PROCEDURE coevolve_castes(task_stream, populations, frozen_evaluator, policy)
    FOR each batch IN task_stream
        proposed_cases <- populations.proposers.GENERATE(batch.context)
        valid_cases <- frozen_evaluator.FILTER_VALID_CASES(proposed_cases)

        routed_cases <- populations.routers.ASSIGN(valid_cases)
        solutions <- populations.solvers.SOLVE(routed_cases)
        critiques <- populations.critics.REVIEW(solutions)
        grades <- frozen_evaluator.GRADE(valid_cases, solutions, critiques)

        UPDATE_FITNESS(populations.proposers, grades.case_quality)
        UPDATE_FITNESS(populations.routers, grades.routing_quality)
        UPDATE_FITNESS(populations.solvers, grades.solution_quality)
        UPDATE_FITNESS(populations.critics, grades.prevented_failures)

        FOR each caste IN populations
            caste <- BREED_WITHIN_ROLE(caste, policy.role_specific_operator_budget)
            caste <- RETIRE_LOW_UTILITY_MEMBERS(caste)
        END FOR
    END FOR
END PROCEDURE
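
The loop above can be tried out in miniature. The Python sketch below is a reduced version under stated assumptions: it covers only the proposer and solver castes, omits routing, critique, and breeding, and uses hypothetical names (`Member`, `coevolve_step`, `is_valid`, `grade`) that do not come from the source. The one property it tries to preserve is that validation and grading are supplied from outside and are never modified by the populations being selected.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Member:
    name: str
    act: Callable          # role-specific behaviour (hypothetical interface)
    fitness: float = 0.0


def update_fitness(members, scores, rate=0.5):
    """Blend each member's new score into its running fitness."""
    for m in members:
        m.fitness = (1 - rate) * m.fitness + rate * scores.get(m.name, 0.0)


def retire_low_utility(members, keep):
    """Keep the `keep` fittest members; ties keep their original order."""
    return sorted(members, key=lambda m: m.fitness, reverse=True)[:keep]


def coevolve_step(context, proposers, solvers, is_valid, grade, keep=2):
    """One batch: propose, filter, solve, grade, then select within each role.

    `is_valid` and `grade` stand in for the frozen evaluator. They are only
    called here, never rebound or mutated.
    """
    cases = [(p.name, case) for p in proposers for case in p.act(context)]
    valid = [(owner, case) for owner, case in cases if is_valid(case)]

    # Proposer score: fraction of its proposals that passed validation.
    proposer_scores = {}
    for p in proposers:
        mine = [c for owner, c in cases if owner == p.name]
        ok = [c for owner, c in valid if owner == p.name]
        proposer_scores[p.name] = len(ok) / len(mine) if mine else 0.0

    # Solver score: mean grade over all valid cases.
    solver_scores = {}
    for s in solvers:
        grades = [grade(case, s.act(case)) for _, case in valid]
        solver_scores[s.name] = sum(grades) / len(grades) if grades else 0.0

    update_fitness(proposers, proposer_scores)
    update_fitness(solvers, solver_scores)
    return retire_low_utility(proposers, keep), retire_low_utility(solvers, keep)


# Toy usage: cases are (a, b) pairs and the expected answer is a + b.
proposers = [
    Member("p_good", lambda ctx: [(1, 2), (3, 4)]),
    Member("p_bad", lambda ctx: [(-1, 2), (5, 6)]),   # one invalid case
]
solvers = [
    Member("s_exact", lambda case: case[0] + case[1]),
    Member("s_off_by_one", lambda case: case[0] + case[1] + 1),
]
is_valid = lambda case: min(case) >= 0
grade = lambda case, answer: 1.0 if answer == sum(case) else 0.0

proposers, solvers = coevolve_step(None, proposers, solvers, is_valid, grade, keep=1)
```

In this toy run the proposer that emits an invalid case and the solver that is off by one are the members retired, since selection happens within each role rather than across roles.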

Design rule

Do not let the judge caste be the sole evaluator for its own descendants. Judges can be bred, but judge candidates are evaluated against frozen test cases, hidden cases, human-labeled samples, and previous judge champions. The selection surface must not be rewritten by the same candidate being selected.
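
One way to read this rule is as a promotion gate. In the sketch below, a judge candidate is promoted only if it agrees with trusted verdicts on every evaluation set and never trails the previous champion on any of them. The names (`agreement`, `should_promote`), the boolean pass/fail judge interface, and the 0.9 threshold are assumptions made for illustration, not part of the source design.

```python
from typing import Callable, Sequence

Judge = Callable[[str], bool]      # hypothetical: maps an output to pass/fail
LabeledCase = tuple[str, bool]     # (output, trusted verdict)


def agreement(judge: Judge, cases: Sequence[LabeledCase]) -> float:
    """Fraction of cases where the judge matches the trusted verdict."""
    if not cases:
        raise ValueError("an evaluation set must not be empty")
    return sum(judge(output) == verdict for output, verdict in cases) / len(cases)


def should_promote(candidate, champion, eval_sets, min_agreement=0.9):
    """Promote only if the candidate clears every set and never trails the champion.

    `eval_sets` maps a set name (e.g. "frozen", "hidden", "human") to labeled
    cases whose verdicts were fixed independently of the candidate.
    """
    for cases in eval_sets.values():
        cand = agreement(candidate, cases)
        if cand < min_agreement or cand < agreement(champion, cases):
            return False
    return True


# Toy usage with three independently labeled sets.
eval_sets = {
    "frozen": [("ok", True), ("bad", False)],
    "hidden": [("ok!", True), ("", False)],
    "human": [("fine", True), ("bad output", False)],
}
champion = lambda out: bool(out) and "bad" not in out
lenient = lambda out: True         # approves everything

lenient_promoted = should_promote(lenient, champion, eval_sets)
champion_kept = should_promote(champion, champion, eval_sets)
```

The candidate never appears on the labeling side of any comparison, so it cannot rewrite the surface it is selected on.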

Positive use cases

Code repair: proposers produce failing tests, solvers patch, judges check exact outputs.
Research synthesis: proposers generate comparison questions, solvers write summaries, critics check source coverage.
Document triage: routers assign specialist extractors, judges validate fields.
Edge assistants: routers decide local, cloud, or no-op based on sensitivity and latency.
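
The edge-assistant case lends itself to a small sketch of "choose the smallest capable path". The request fields, the latency thresholds, and the rule that sensitive data never goes to the cloud are all assumptions made for this example. A bred router would learn its policy rather than hard-code it, but a fixed rule like this could serve as a baseline to score it against.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    LOCAL = "local"
    CLOUD = "cloud"
    NO_OP = "no-op"


@dataclass(frozen=True)
class Request:
    sensitive: bool            # assumed: data must not leave the device
    latency_budget_ms: int     # how long the caller can wait
    needs_large_model: bool    # assumed: the local model is insufficient


def route(req: Request, local_ms: int = 50, cloud_ms: int = 400) -> Route:
    """Pick the smallest capable path; decline when no path fits the constraints."""
    # Prefer the local model whenever it is capable and fast enough.
    if not req.needs_large_model and req.latency_budget_ms >= local_ms:
        return Route.LOCAL
    # Escalate to the cloud only for non-sensitive requests that can wait.
    if req.needs_large_model and not req.sensitive and req.latency_budget_ms >= cloud_ms:
        return Route.CLOUD
    # Otherwise do nothing rather than violate sensitivity or latency limits.
    return Route.NO_OP
```

For example, `route(Request(sensitive=True, latency_budget_ms=1000, needs_large_model=True))` returns `Route.NO_OP`, because the only capable path would send sensitive data off the device.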

Source reports used for this guide

These reports are preserved verbatim in the site archive. The guide above is an editorial synthesis and may narrow, qualify, or reorganize claims from the source material.

Project improvement directives: Architectural Redesign and Theoretical Expansion of ModelBreeder (Mixed: source evidence requiring editorial filtering · 69.4 KB)
Core synthesis: Teleodynamic Evolution of AI Ecosystems (Conceptual synthesis · 15.3 KB)
Core synthesis: The Four Fs of AI: Code Breeding, Model Breeding, and the Teleodynamic Convergence of Mutable Small-Model Ecologies (Conceptual synthesis · 80.5 KB)

Related guides

Surrogate evaluation

Lineage experiments

Positive selection metrics

Evolution lab