The Four Fs of AI: Code Breeding, Model Breeding, and the Teleodynamic Convergence of Mutable Small-Model Ecologies
The Paradigm Shift: From Monolithic Scaling to Nature-Inspired Ecologies
The dominant paradigm of artificial intelligence has long relied on scaling laws, which predict that increases in parameters, compute, and data yield predictable, power-law gains in capability.1 However, this trajectory faces clear barriers: extreme environmental and compute costs, resource constraints, and behavioral homogenization.1 In response, a shift is occurring toward decentralized, biology-mimicking architectures.1 The core thesis of this shift is that the long-term unit of artificial evolution should not be a single, ever-growing model.1 Instead, it should be a population of small, mutable, and replaceable models whose code, weights, composition, and resource allocation evolve under a bounded viability function.1 This approach, pioneered by nature-inspired research labs, draws on collective intelligence and evolutionary biology to achieve task-specific efficiency.1 Rather than building ever-larger monolithic structures, these techniques adapt Transformer-based models into collective, swarm-like systems.1 The biological reference point is the swarm, such as a school of fish, where simple local rules followed by individual agents give rise to complex, emergent group behavior.1 From this evolutionary perspective, the artificial intelligence landscape becomes a distributed, evolved network of specialized, collaborative intelligences.1
The Digital Four Fs: A Basal Survival Framework for Artificial Systems
In biological ecosystems, basal survival and adaptation are driven by the "Four Fs": fighting, fleeing, feeding, and reproduction. For artificial systems operating within resource-constrained environments, each of these imperatives maps onto a software-native operation, with reproduction realized as forking.1
| Digital F | Biological Analogue | AI Functional Objective | Code-Breeding Operation | Model-Breeding Operation |
|---|---|---|---|---|
| Feed | Feeding | Acquire the structural and informative resources needed to maintain utility and avoid decay.1 | Collect execution traces, unit tests, profiling data, execution failures, patches, and hardware telemetry.1 | Acquire training examples, human and AI feedback, retrieval context, distilled knowledge, and compute allocations.1 |
| Fork | Reproduction | Generate variations and produce adapted descendants to explore the capability landscape.1 | Branch repository code, mutate algorithms, replace tensor kernels, synthesize patches, and recombine active modules.1 | Clone weights, attach low-rank adapters, perturb parameters, split mixture-of-experts, or merge compatible task vectors.1 |
| Fight | Competition | Subject candidates to direct environmental selection pressure to determine fitness.1 | Run candidate variants against unit tests, automated fuzzers, security suites, and latency targets.1 | Benchmark specialists independently, conduct adversarial evaluations, compare model calibration, and select optimal coalitions.1 |
| Flee | Escape / Avoidance | Withdraw from unproductive, resource-intensive, or insecure execution states.1 | Roll back deployments, quarantine anomalous branches, disable features, or delete broken code paths.1 | Gate off failing models, unload parameters, prune inactive weights, dequantize, or route around unsuitable models.1 |
The operational cycle of this ecology is defined by a continuous directed graph:

$$\text{Feed} \rightarrow \text{Fork} \rightarrow \text{Fight} \rightarrow \text{Flee} \rightarrow \text{Feed}$$

This cycle governs the thermodynamic and informational state of the ecology. The Feed phase supplies the raw information and computational energy required to fuel variation.3 Fork introduces targeted genetic variation, projecting candidates into the viability landscape.1 Fight subjects these candidates to selection pressures, filtering out suboptimal variants.1 Finally, Flee acts as the primary entropic exhaust of the system, purging unproductive structures to prevent systemic bloat.1 Without the Flee operation, evolutionary systems accumulate dead weight, leading to memory exhaustion and runtime failure.1 Without Fork, the system stagnates, losing the capacity to adapt to environmental shifts.1 Without Fight, variation remains arbitrary, degrading the system into chaotic noise.1 Without Feed, the thermodynamic engine lacks the resources to drive either learning or evaluation.3
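The four phases can be sketched as a toy loop. This is a minimal illustration under stated assumptions, not the source's implementation: candidates are plain numbers, `fitness` is a hypothetical stand-in for a test suite, `feed` returns a hypothetical observation of the target, and the population cap `cap` plays the role of the resource budget.

```python
import random

def fitness(candidate: float) -> float:
    """Hypothetical stand-in for a benchmark: the best score is at 3.0."""
    return -abs(candidate - 3.0)

def four_f_cycle(population, feed, rng, cap=4):
    # Feed: take in new information (here, one observation of the target).
    hint = feed()
    # Fork: every member produces one mutated descendant nudged toward the hint.
    children = [p + 0.5 * (hint - p) + rng.gauss(0, 0.1) for p in population]
    # Fight: parents and children are ranked under the same fitness test.
    ranked = sorted(population + children, key=fitness, reverse=True)
    # Flee: everything beyond the resource cap is purged, so dead weight
    # never accumulates across generations.
    return ranked[:cap]

rng = random.Random(0)
population = [0.0, 10.0, -5.0, 7.0]
initial_best = max(fitness(p) for p in population)
for _ in range(20):
    population = four_f_cycle(population, feed=lambda: 3.0, rng=rng)
final_best = max(fitness(p) for p in population)
```

Because parents compete alongside their children in the Fight step, the best fitness in the population can never decrease from one cycle to the next, while the Flee step keeps the population at a fixed size.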
Algorithmic vs. Parametric Evolution: Code Breeding and Model Breeding
A stable evolutionary architecture must separate the evolution of the execution machinery (code breeding) from the evolution of parameter-space competencies (model breeding).1 These two processes are functionally connected but must not be conflated.1
Code Breeding
Code breeding modifies the symbolic, algorithmic backbone of the system.1 Its genome includes:
- Inference algorithms and attention mechanisms.1
- Routing policies and mixture-of-expert selection rules.1
- Training, distillation, and evaluation pipelines.1
- High-performance tensor kernels and hardware-specific compilation logic.1
- Context-management logic, caching policies, and memory-accounting rules.1
- Declared model interfaces, system prompts, and declarative workflows.1
Mutations in code breeding are expressed as discrete symbolic changes, such as synthesizing a faster GPU kernel, modifying a caching policy, or introducing an alternative routing strategy.1
Model Breeding
Model breeding operates entirely within the continuous parameter space of the system.1 Its genome includes:
- Learned weight matrices and parameters.1
- Low-rank adapters (LoRA) and probabilistic vector-based matrices.1
- Expert partitions within a sparse architecture.1
- Tokenizer vocabularies and layer-wise architectural configurations.1
- Quantization state parameters (such as Q4 GGUF scaling factors).1
- Training recipes, hyperparameter histories, and data lineage logs.1
Mutations in model breeding are expressed as fine-tuning, pruning, quantization, expert splitting, adapter fusion, task-vector arithmetic, or distillation into smaller descendants.1
| Evolutionary Property | Code Breeding | Model Breeding |
|---|---|---|
| Primary Object | Algorithms, symbolic representations, and execution structure.1 | Numerical parameters, weight matrices, and learned behaviors.1 |
| Typical Mutation | AST/IR modification, patch synthesis, routing policy updates.1 | Weight fine-tuning, low-rank adaptation, parameter pruning, quantization.1 |
| Typical Recombination | Algorithmic composition, transplanting functional kernels.1 | Adapter fusion, task-vector arithmetic, layer-wise weight merging.1 |
| Evaluation Criteria | Test suite compliance, security invariants, latency, memory utilization.1 | Downstream task accuracy, calibration, semantic robustness, specialization.1 |
| Principal Failure Mode | Fatal runtime execution crashes, syntax errors, security vulnerabilities.1 | Behavioral regression, catastrophic forgetting, representational drift.1 |
| Inheritance Mechanism | Symbolic source code lineage and version-controlled registries.1 | Weight provenance graphs, parameter parentage metadata, evaluation cards.1 |
This structural separation mitigates a fundamental evolutionary hazard: the self-referential validation loop.1 If a mutating model is permitted to alter the code that evaluates its own fitness, the evolutionary feedback loop collapses.1 The system inevitably exploits shortcuts, leading to runaway optimization of the fitness metric without achieving actual capability.1 To maintain stability, the evaluation machinery must be kept separate from the parameters running inside it.1 A model may propose offspring, but the fitness functions governing those offspring must be protected by an external code-breeding boundary.1
Teleodynamic Learning and Constraint-Bounded Selection
Conventional machine learning relies on static optimization, minimizing a loss function over a fixed dataset. Teleodynamic learning, by contrast, conceptualizes learning as the self-preservation of functional organization under physical and computational constraints.3 In the thermodynamic formulations of life from which the concept is drawn, a teleodynamic system achieves constraint closure: its internal processes generate and maintain the boundaries necessary to prevent its own dissipation.4 In an active teleodynamic system, learning operates as a coupled, two-timescale dynamical process 3:
- Inner Timescale (Continuous Parameter Adaptation): Rapid, local weight updates, in-context learning, and adaptive routing that adjust to incoming data streams without altering the structural architecture.3
- Outer Timescale (Discrete Structural Change): Slower, reversible operations such as adding new experts, merging redundant modules, compressing active weights, or retiring obsolete components.1
These two levels are coupled via an endogenous resource variable, which tracks the computational, memory, and energy reserves of the system.3 A structural action $a$ is attempted only if its projected return repays its lifetime operational costs.1 The viability utility $V(a)$ of a candidate action $a$ is calculated as:

$$V(a) = \Delta U + \Delta R + \Delta D + \Delta C - \Delta M - \Delta L - \Delta E - \Delta S - \Delta K$$

where the positive contributions are:
- $\Delta U$: Net change in core task utility.1
- $\Delta R$: Net change in predictive robustness and calibration error.1
- $\Delta D$: Net change in behavioral and representational diversity, preserving the system's capacity to adapt to future environment shifts.1
- $\Delta C$: Net change in task and domain coverage.1

and the negative penalties are:
- $\Delta M$: Additional memory footprint, penalizing excessive storage overhead.1
- $\Delta L$: Additional inference latency, ensuring real-time responsiveness.1
- $\Delta E$: Energy and computation expenditure required to run the module.1
- $\Delta S$: Security, license compliance, or data provenance risk introduced by the module.1
- $\Delta K$: Structural complexity and dependency maintenance burden.1

The system commits to a structural action $a$ if and only if the viability utility exceeds a strict, environment-dependent threshold $\theta$, and all hard safety and security invariants pass 1:

$$\text{commit}(a) \iff V(a) > \theta \;\wedge\; \text{invariants}(a)$$

When no candidate action satisfies this condition, the optimal action resolves to a non-growth state 1:

$$a^{*} = \varnothing$$

This formulation challenges the conventional assumption that continuous expansion is desirable.1 A teleodynamic system treats stable non-growth as a valid, often optimal adaptive state.1 These dynamics are demonstrated in systems like the Distinction Engine (DE11).15 Rather than relying on top-down regularization terms, the DE11 instantiates learning through Spencer-Brown's Laws of Form, information geometry, and tropical optimization.15 Tested on standard benchmarks, the DE11 achieves 93.3% accuracy on IRIS, 92.6% on WINE, and 94.7% on Breast Cancer while producing interpretable logical rules that emerge directly from the system’s need to compress and survive within its information-theoretic constraints.15
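The commit rule described above can be sketched in a few lines. This is an illustrative sketch, not the source's controller: it assumes the viability utility is the plain sum of the four gains minus the sum of the five penalties, with every term already normalized to a common scale. The unweighted form, the field names, and the example numbers are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass(frozen=True)
class Candidate:
    """One proposed structural action; every field name is hypothetical."""
    name: str
    d_utility: float = 0.0      # gain: core task utility
    d_robustness: float = 0.0   # gain: robustness and calibration
    d_diversity: float = 0.0    # gain: behavioral diversity
    d_coverage: float = 0.0     # gain: task and domain coverage
    d_memory: float = 0.0       # penalty: memory footprint
    d_latency: float = 0.0      # penalty: inference latency
    d_energy: float = 0.0       # penalty: energy and compute
    d_risk: float = 0.0         # penalty: security, license, provenance risk
    d_complexity: float = 0.0   # penalty: maintenance burden
    invariants_pass: bool = True

def viability(c: Candidate) -> float:
    gains = c.d_utility + c.d_robustness + c.d_diversity + c.d_coverage
    costs = c.d_memory + c.d_latency + c.d_energy + c.d_risk + c.d_complexity
    return gains - costs

def choose_action(candidates: Sequence[Candidate], theta: float) -> Optional[Candidate]:
    """Return the best admissible action, or None for the non-growth state."""
    admissible = [c for c in candidates
                  if c.invariants_pass and viability(c) > theta]
    return max(admissible, key=viability, default=None)

add_expert = Candidate("add-expert", d_utility=0.30, d_coverage=0.10,
                       d_memory=0.15, d_latency=0.05)
unsafe = Candidate("unsigned-merge", d_utility=0.90, invariants_pass=False)
```

With these numbers, `choose_action([add_expert, unsafe], theta=0.1)` selects `add_expert`, while raising `theta` to 0.5 returns `None`: the controller stays put rather than grow. The unsafe candidate is never selected regardless of its utility, because the invariants act as a hard gate and not as one more term in the sum.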
Teleodynamic Convergence in Heterogeneous Specialist Populations
The long-term state of a teleodynamic system is not a single, all-encompassing foundation model.1 Instead, it is a localized, minimum-sufficient coalition of specialized models that collectively cover the target task space.1 A model ecology $\mathcal{E}$ achieves local convergence when no further mutations, insertions, or retirements can improve its net viability score under current environmental constraints, that is, when no admissible structural action $a \in \mathcal{A}(\mathcal{E})$ has a viability utility $V(a)$ above the threshold $\theta$ 1:

$$\forall a \in \mathcal{A}(\mathcal{E}): \quad V(a) \le \theta$$

Unlike static optimization convergence, this state is metastable.1 Because the ecology is an open system coupled to an active environment, any shift in the task distribution, hardware allocation, or security landscape reshapes the viability surface.1 This disruption triggers a phase transition, prompting the ecology to adapt through new rounds of mutations, merges, or purges.1
[Figure: diamond diagram. Top: high-resource environment (θ is low). Bottom: low-resource environment (θ is high). Center: Active Coalition → shift in θ → Sparse & Specialized; the ecology evolves via Fork & Fight.]
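Metastable convergence can be illustrated with a small greedy search. The sketch below is a toy under stated assumptions: the ecology is a set of specialist names, the hypothetical `score` is "tasks covered minus total cost", and one step adds or retires a single specialist only if it improves the score by more than `theta`. The catalog and its costs are invented for illustration.

```python
def score(ecology, catalog, tasks):
    """Toy net viability: number of target tasks covered minus total cost."""
    covered = set().union(*(catalog[m]["covers"] for m in ecology))
    return len(covered & tasks) - sum(catalog[m]["cost"] for m in ecology)

def converge(ecology, catalog, tasks, theta):
    """Apply the best single insertion or retirement until none beats theta."""
    if theta < 0:
        raise ValueError("theta must be non-negative for the search to terminate")
    ecology = set(ecology)
    while True:
        moves = [ecology | {m} for m in catalog if m not in ecology]
        moves += [ecology - {m} for m in ecology]
        best = max(moves, key=lambda e: score(e, catalog, tasks), default=None)
        current = score(ecology, catalog, tasks)
        if best is None or score(best, catalog, tasks) - current <= theta:
            return ecology  # locally converged: no move clears the threshold
        ecology = best

catalog = {
    "math": {"covers": {"algebra", "proofs"}, "cost": 0.6},
    "code": {"covers": {"python"}, "cost": 0.4},
    "generalist": {"covers": {"algebra", "python"}, "cost": 1.5},
}
tasks = {"algebra", "proofs", "python"}

coalition = converge({"generalist"}, catalog, tasks, theta=0.1)
after_shift = converge(coalition, catalog, {"python"}, theta=0.1)
frozen = converge({"generalist"}, catalog, tasks, theta=0.7)
```

Starting from the single generalist with a low threshold, the search ends at the two-specialist coalition `{"math", "code"}`. When the task distribution shrinks to `{"python"}`, the same coalition is no longer a fixed point and the search retires `math`. With a high threshold, no move is worth its cost and the ecology stays where it started.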
This ecological structure, inspired by biological swarming, offers a clear alternative to monolithic scaling.1 In a monolithic architecture, adapting to a new task requires updating a single, giant parameter store, risking catastrophic forgetting or representational collapse.9 An ecology, by contrast, preserves general capability by maintaining a diverse population of specialized models.1 When a new task arises, the system simply breeds or merges a new specialist, leaving the rest of the population intact.1 This is demonstrated practically in Evolutionary Model Merge techniques.1 This approach applies genetic algorithms in both parameter space (PS) and data flow space (DFS) to automate the combination of diverse open-source models.6 Rather than relying on human intuition, it breeds high-performing offspring:
- Parameter Space Optimization: Evolving layer-wise mixing weights to blend attention and feed-forward blocks from diverse parent models.1
- Data Flow Space Permutation: Evolving layer routing paths during inference, allowing layers from Model A to feed dynamically into arbitrary layers of Model B.25
Using these methods, researchers have bred specialized hybrid models, including a Japanese LLM with math reasoning capabilities (built from Shisa-Gamma and WizardMath) and a Japanese Vision-Language Model (built from LLaVa-1.6-Mistral-7B), that achieve state-of-the-art results on the MGSM-JA and JA-VG-VQA-500 benchmarks.1 The evolved 7B-parameter LLM is reported to surpass some earlier 70B-parameter Japanese LLMs on these benchmarks, which suggests that genetic recombination can be more efficient than brute-force parameter scaling.1
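The parameter-space half of this idea can be sketched with a toy evolutionary search over layer-wise mixing weights. This is not the published method: the "models" are short lists of numbers, the search is a simple greedy mutate-and-keep-the-best loop chosen for brevity, and `loss` is a hypothetical stand-in for a benchmark score. The data-flow-space search is not covered.

```python
import random

def merge(parent_a, parent_b, mix):
    """Blend layer by layer: mix[l] = 0 keeps parent A's layer, 1 keeps B's."""
    return [[(1 - w) * a + w * b for a, b in zip(layer_a, layer_b)]
            for layer_a, layer_b, w in zip(parent_a, parent_b, mix)]

def evolve_mix(parent_a, parent_b, loss, generations=200, offspring=8,
               sigma=0.2, seed=0):
    """Evolve one mixing weight per layer; keep a child only if it is better."""
    rng = random.Random(seed)
    best = [0.5] * len(parent_a)          # start from a uniform average
    best_loss = loss(merge(parent_a, parent_b, best))
    for _ in range(generations):
        for _ in range(offspring):
            # Mutate every weight with Gaussian noise, clamped to [0, 1].
            child = [min(1.0, max(0.0, w + rng.gauss(0, sigma))) for w in best]
            child_loss = loss(merge(parent_a, parent_b, child))
            if child_loss < best_loss:
                best, best_loss = child, child_loss
    return best, best_loss

# Toy parents: A is "good" at layer 0, B is "good" at layer 1.
parent_a = [[1.0, 1.0], [0.0, 0.0]]
parent_b = [[0.0, 0.0], [1.0, 1.0]]
target = [[1.0, 1.0], [1.0, 1.0]]

def loss(model):
    return sum((x - t) ** 2 for layer, ref in zip(model, target)
               for x, t in zip(layer, ref))

uniform_loss = loss(merge(parent_a, parent_b, [0.5, 0.5]))
mix, evolved_loss = evolve_mix(parent_a, parent_b, loss)
```

The uniform average scores a loss of 1.0 on this toy problem, and the search moves the mix toward taking layer 0 from parent A and layer 1 from parent B. The point of the design is that the recipe (one weight per layer) is searched automatically, without any gradient through the parents.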
The Architecture of Model Interchangeability
For an ecology of small models to successfully evolve, its individual units must be modular and replaceable.1 However, "interchangeability" is not a uniform property; it operates across four distinct technical levels.1
| Interchangeability Level | Meaning | Practical Status |
|---|---|---|
| Contract Interchangeability | Modules expose the identical input/output schemas, task descriptions, and API contracts.1 | Broadly Feasible: Handled through standardized JSON schemas, tool calling protocols, and system-level capability manifests.1 |
| Runtime Interchangeability | Models of varying architectures (e.g., LLaMA, Mistral) run seamlessly inside a single execution engine.1 | Highly Feasible: Achieved through standardized model formats (e.g., GGUF) and unified inference runtimes.1 |
| Parameter Interchangeability | Weights, layers, or low-rank adapters are directly merged or mixed in parameter space.1 | Architecturally Restricted: Limited to models sharing an identical base architecture, layer structure, and vocabulary.1 |
| Semantic Interchangeability | The internal activations and hidden representations of one model can be consumed directly by another.1 | Difficult: Requires trained projection matrices, autoencoders, or high-overhead distillation wrappers to align semantic spaces.1 |
Given the high mathematical overhead of semantic alignment and the architectural limits of parameter merging, the most robust abstraction for an evolutionary system is behavioral interchangeability.1 The controller does not assume that two models share matching internal weights.1 Instead, it treats them as black-box components that satisfy a common, versioned capability contract.1

This behavioral abstraction is made practical by task vector arithmetic.9 Task-specific capabilities are represented as linear directions in a model's weight space:

$$\tau_t = \theta_{\mathrm{ft}}^{(t)} - \theta_{\mathrm{pre}}$$

where $\theta_{\mathrm{pre}}$ represents the base model parameters and $\theta_{\mathrm{ft}}^{(t)}$ represents the parameters fine-tuned on task $t$ (these subscripted parameter vectors are distinct from the viability threshold $\theta$). From these directions the system can dynamically construct specialized variations 9:

$$\theta_{\mathrm{new}} = \theta_{\mathrm{pre}} + \lambda \sum_{t} \tau_t$$

with $\lambda$ a scaling coefficient. Task vectors can be combined (addition for multi-task learning), negated (for targeted unlearning and debiasing), or composed by analogy ($\tau_D \approx \tau_C + (\tau_B - \tau_A)$).9 When multiple task vectors are combined, they can suffer from parameter interference and representational drift.9 To address this, the ecology uses task vector bases, compressing $T$ task vectors into a set of $M < T$ basis vectors.9 Representing each task vector as a structured linear combination of basis atoms preserves the functional advantage of task arithmetic while reducing storage and memory overhead.9
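Task arithmetic is easy to show on toy parameters. The sketch below treats a task vector as the element-wise difference between fine-tuned and base parameters and assumes both checkpoints share the same parameter names and shapes. The dictionaries, names, and values are invented for illustration; real checkpoints would hold tensors.

```python
def task_vector(pretrained, finetuned):
    """tau = theta_ft - theta_pre, computed per named parameter."""
    return {name: [f - p for p, f in zip(pretrained[name], finetuned[name])]
            for name in pretrained}

def negate(vector):
    """Flip a task vector, used for targeted unlearning."""
    return {name: [-d for d in delta] for name, delta in vector.items()}

def apply_vectors(pretrained, vectors, scale=1.0):
    """theta_new = theta_pre + scale * sum(tau_t)."""
    merged = {name: list(values) for name, values in pretrained.items()}
    for vector in vectors:
        for name, delta in vector.items():
            merged[name] = [w + scale * d for w, d in zip(merged[name], delta)]
    return merged

base = {"w": [1.0, 2.0]}
math_ft = {"w": [1.5, 2.0]}   # hypothetical fine-tune that moved w[0]
ja_ft = {"w": [1.0, 3.0]}     # hypothetical fine-tune that moved w[1]

tau_math = task_vector(base, math_ft)
tau_ja = task_vector(base, ja_ft)

multi_task = apply_vectors(base, [tau_math, tau_ja])     # addition
forgotten = apply_vectors(math_ft, [negate(tau_math)])   # negation
```

In this example the two task vectors touch different coordinates, so addition recovers both edits exactly. When task vectors overlap on the same parameters, the sum is where the interference described above appears.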
Progressive Disclosure and the Agent Skills Paradigm
To scale the capabilities of a small-model ecology without exceeding hardware limits, the system replaces monolithic prompt engineering with a modular "Agent Skills" paradigm.17 Instead of encoding all procedural knowledge in model weights or massive system prompts, skills are packaged as structured directories that are discovered and loaded on demand.17 At its core, a skill is packaged as a directory containing a SKILL.md file with YAML frontmatter specifying its name and description.32 The architecture manages these skill packages through three distinct levels of progressive disclosure:
| Disclosure Level | Content Type | Operational Loading Behavior | Token Cost Impact |
|---|---|---|---|
| Level 1: Metadata | YAML frontmatter (Name, Description).32 | Pre-loaded into the system prompt at startup to build a lightweight skill registry.27 | Minimal: ~100 tokens per skill. A library of 50 skills costs ~5,000 tokens.19 |
| Level 2: Instructions | Markdown body (Workflows, Best Practices).32 | Loaded into context only when the router semantically matches a user query to the skill's description.19 | Moderate: Under 5,000 tokens and 500 lines per activated skill.19 |
| Level 3: Resources | Code, reference docs, schemas, assets.32 | Executed locally or parsed on demand on the filesystem only when instructions explicitly require them.19 | Zero-context footprint: Code runs in a sandbox, returning only short execution results.33 |
This progressive disclosure architecture sharply reduces the context-window tax of complex agent systems.19 In traditional architectures, developers rely on "mega-prompts" that load every tool, instruction, and template into the context window up front.19 This results in massive token overhead on every message turn and degrades reasoning performance via the "lost-in-the-middle" effect.19 By contrast, progressive disclosure ensures that the agent pays only a small discovery cost at startup, reading the full body of instructions and executing helper scripts only when a specific skill is activated.19
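The first two disclosure levels can be sketched as a small registry. This is a simplified illustration, not a real skills runtime: the frontmatter parser handles only flat `key: value` lines, the keyword-overlap match is a crude stand-in for the semantic router, and the two skills (including the path `scripts/extract.py`) are hypothetical.

```python
def split_skill(text):
    """Split SKILL.md text into (frontmatter dict, markdown body)."""
    _, header, body = text.split("---", 2)
    meta = {}
    for line in header.strip().splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, body.strip()

class SkillRegistry:
    def __init__(self, skill_texts):
        self._skills = {}
        for text in skill_texts:
            meta, body = split_skill(text)
            self._skills[meta["name"]] = (meta["description"], body)

    def level1_prompt(self):
        """Level 1: only names and descriptions enter the system prompt."""
        return "\n".join(f"- {name}: {desc}"
                         for name, (desc, _) in self._skills.items())

    def activate(self, query):
        """Level 2: return the body of the best-matching skill, or None."""
        words = set(query.lower().split())

        def overlap(name):
            return len(words & set(self._skills[name][0].lower().split()))

        best = max(self._skills, key=overlap, default=None)
        if best is None or overlap(best) == 0:
            return None
        return self._skills[best][1]

PDF_SKILL = """---
name: pdf-extract
description: Extract text and tables from PDF files
---
# PDF extraction
1. Run scripts/extract.py on the input file.
2. Return the tables as CSV.
"""

SQL_SKILL = """---
name: sql-tuning
description: Write and optimize SQL queries
---
# SQL tuning
Inspect the query plan before rewriting.
"""

registry = SkillRegistry([PDF_SKILL, SQL_SKILL])
startup_prompt = registry.level1_prompt()
loaded = registry.activate("extract tables from a pdf file")
```

Only the two description lines are present at startup; the workflow text is read when a query matches. Level 3 resources such as the helper script would be executed outside the context window and are not modeled here.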
Edge-Native Execution and Local Resource Budgets
To make a small-model ecology practical, the system requires a lightweight, highly efficient execution substrate.1 Running hundreds of specialized models on centralized cloud infrastructure introduces massive latency and bandwidth bottlenecks.1 Instead, the system leverages browser-native Rust, WebAssembly (WASM), and WebGPU substrates to run model ecologies directly on client edge devices.1 The physical limitations of browser environments—such as WASM's strict 4GB address space limit—are often viewed as obstacles.13 In a teleodynamic framework, however, these constraints act as the exact thermodynamic boundaries required to drive evolutionary efficiency.3 Without strict resource limits, there is no selection pressure to prune redundant parameters or specialize behaviors.1 To operate within these tight constraints, the edge-native runtime implements several key optimizations:
| Engineering Problem | Physical Constraint | Technical Solution |
|---|---|---|
| WASM Address Space Limits | WASM enforces a strict 4GB RAM address space limit and ArrayBuffer allocation caps, making large models impossible to load natively.13 | Compress the model using Q4 quantization (e.g., Voxtral Mini Realtime to 2.5GB).13 Shard the weights into files under 512MB.13 Implement a two-phase loading pattern: parse weight structures, immediately drop the file reader to free up WASM space, and then finalize the model on the GPU.13 |
| GPU Memory Overhead | Large embedding tables (e.g., 1.5GB) exceed browser GPU memory limits, crashing active tabs.13 | Store a Q4-quantized version of the embeddings on the GPU, shrinking the footprint to 216MB.13 Couple this with CPU-side row lookups to fetch embeddings dynamically.13 |
| Asynchronous WebGPU Readback | WebGPU does not support synchronous readback, blocking traditional synchronous text-generation loops.13 | Design the entire decode loop of the model to run asynchronously, integrating asynchronous readback methods directly into the inference engine.13 |
| Workgroup Invocation Limits | WebGPU imposes a maximum limit of 256 invocations per workgroup, breaking standard matrix-multiplication kernels.13 | Patch the compiler backend (such as cubecl-wgpu) to ensure workgroup sizes stay strictly within spec.13 Write custom WGSL shaders to handle fused Q4 dequantization and matrix multiplication directly on the GPU.13 |
| Quantization Audio Sensitivity | Audio processing models are highly sensitive to padding under Q4 quantization, leading to transcription degradation.13 | Use 76 silence tokens instead of the upstream standard of 32 to successfully cover all decoder prefix positions.13 |
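The embedding-table figures in the table can be checked with a short block-quantization sketch. Two assumptions are made that the source does not state: the 1.5GB table is stored as 32-bit floats, and the Q4 layout packs 32 weights into 18 bytes (16 bytes of 4-bit codes plus a 2-byte scale), similar to GGML's Q4_0. The rounding scheme below is a simplified symmetric variant and is not bit-exact with any particular runtime.

```python
BLOCK = 32            # weights per quantization block (assumed)
BYTES_PER_BLOCK = 18  # 16 bytes of packed 4-bit codes + 2-byte fp16 scale (assumed)

def quantize_block(values):
    """Map one block of floats to 4-bit codes in [0, 15] and a shared scale."""
    amax = max(abs(v) for v in values)
    scale = amax / 7 if amax else 1.0
    codes = [min(15, max(0, round(v / scale) + 8)) for v in values]
    return scale, codes

def dequantize_block(scale, codes):
    """Recover approximate floats from the codes and the shared scale."""
    return [(c - 8) * scale for c in codes]

def q4_bytes(n_weights):
    """Storage needed for n_weights under the assumed block layout."""
    blocks = -(-n_weights // BLOCK)  # ceiling division
    return blocks * BYTES_PER_BLOCK

# Footprint check: a 1.5 GiB float32 table holds 1.5 * 2**30 / 4 weights.
embedding_weights = int(1.5 * 2**30) // 4
q4_mib = q4_bytes(embedding_weights) / 2**20

# Round-trip check on one block of evenly spaced values.
block = [i / 10 - 1.6 for i in range(BLOCK)]
scale, codes = quantize_block(block)
restored = dequantize_block(scale, codes)
max_error = max(abs(a - b) for a, b in zip(block, restored))
```

Under these assumptions the quantized table comes to 216 MiB, which matches the figure quoted in the table, and the round-trip error on each weight is bounded by half the block scale.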
This local execution architecture is organized around Mozilla's "3W" pattern: WebLLM for local model inference, WebAssembly for high-performance routing logic, and WebWorkers to keep the user interface responsive during token generation.14 Behind the scenes, the main thread spawns a dedicated worker, compiles the WASM runtime context (using wasm-bindgen for Rust, wasm_exec.js for Go, or Pyodide for Python), and initializes an independent WebLLM instance.14 Communication between the main thread and the background worker is handled via Comlink, eliminating postMessage complexity with a simple RPC boundary.14 This edge-native substrate introduces clear operational constraints that the teleodynamic controller must navigate:
- Brutal Startup Times: Initial model downloads are extremely heavy; loading a DeepSeek-8B model locally can take 2 to 3 minutes, demanding aggressive IndexedDB caching and service worker pre-fetching.14
- Tab Memory Crashes: Chrome and Safari tabs will crash if multiple WebLLM instances coexist without strict garbage collection.14 The runtime must terminate and reinitialize inactive engines on model switches.14
- Hardware Disparities: While the ecology runs smoothly on modern Apple Silicon (such as M-series chips), performance degrades on older GPU hardware, forcing the controller to swap in ultra-lightweight models (such as TinyLlama or Phi-1.5) at the expense of reasoning capability.14
- Package Restrictions: Python runtimes compiled via Pyodide suffer from strict library import restrictions and unpredictable memory spikes, making compiled Rust runtimes the preferred choice for execution speed and stability.14
Evaluation-Driven Development and Operations Reference Architecture
The coordination of this mutable small-model ecology is managed by an Evaluation-Driven Development and Operations (EDDOps) reference architecture.12 EDDOps integrates development-time (offline) and execution-time (online) evaluations into a closed loop, ensuring that behavioral modifications are continuously governed.12 The architecture is divided into three functional layers 5:
```text
+----------------------------------------------------------------------------------+
|                                SUPPLY CHAIN LAYER                                |
| Planning & Design | Data Processing | Model Selection | Test Cases & Invariants  |
+----------------------------------------------------------------------------------+
                                         │
                                         ▼
+----------------------------------------------------------------------------------+
|                                   AGENT LAYER                                    |
| Context Engine | Reasoning & Planning | Active Coalition | Local Guardrails      |
+----------------------------------------------------------------------------------+
                                         │
                                         ▼
+----------------------------------------------------------------------------------+
|                                 OPERATIONS LAYER                                 |
| Human-AI Evaluators | AgentOps Logging | Diagnostics | Teleodynamic Controller   |
+----------------------------------------------------------------------------------+
```
The Supply Chain Layer
This layer establishes safety requirements, curates training sets, registers versioned skill packages, and maintains the test suites.5 It acts as the genetic archive of the system, defining the parameters of the viability landscape.1
The Agent Layer
This layer handles real-world interaction.5 It maintains the active model coalition, routes queries via the context engine, and runs localized guardrails to block malicious input.5
The Operations Layer
This layer performs real-time diagnostics, monitors system performance, collects telemetry logs, and hosts the Teleodynamic Controller.5 It continuously analyzes quality evidence using both human feedback and automated AI evaluators.5
The Evolution and Canary Deployment Path
When the Operations Layer detects a rise in latency, cost, or task failure rates, the Teleodynamic Controller initiates an evolutionary cycle 1:
- Candidate Generation: The Breeder proposes a candidate code or model descendant.1
- Offline Verification: The candidate is evaluated in an isolated Sandbox, running against the regression test suites hosted in the Supply Chain Layer.1
- Canary Rollout: If the candidate passes the offline gate, it enters a canary rollout, receiving a tiny slice (e.g., 1%) of production traffic.1
- Active Monitoring: The controller watches core operational metrics: latency, error rates, and evaluation pass rates.11
- Auto-Rollback: If any canary metric breaches a registered guardrail threshold, the controller automatically drains traffic from the canary to 0% in seconds, paging the on-call engineer only after the system has been stabilized.11
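The monitoring and rollback steps above can be sketched as a single decision function. This is a minimal sketch under stated assumptions: the metric names, the guardrail values, and the ramp schedule `(1, 5, 25, 100)` are hypothetical, and paging, traffic draining, and metric collection are left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Guardrails:
    """Registered thresholds for a canary; the values used below are invented."""
    max_p95_latency_ms: float
    max_error_rate: float
    min_eval_pass_rate: float

def breaches(metrics, guardrails):
    """List the names of every guardrail the canary currently violates."""
    failed = []
    if metrics["p95_latency_ms"] > guardrails.max_p95_latency_ms:
        failed.append("latency")
    if metrics["error_rate"] > guardrails.max_error_rate:
        failed.append("error_rate")
    if metrics["eval_pass_rate"] < guardrails.min_eval_pass_rate:
        failed.append("eval_pass_rate")
    return failed

def canary_step(traffic_pct, metrics, guardrails, ramp=(1, 5, 25, 100)):
    """Return (new traffic percentage, action) for one monitoring interval."""
    failed = breaches(metrics, guardrails)
    if failed:
        # Any breach drains the canary to 0% before a human is involved.
        return 0, "rollback: " + ", ".join(failed)
    higher = [pct for pct in ramp if pct > traffic_pct]
    if higher:
        return higher[0], "promote"
    return traffic_pct, "hold"

guardrails = Guardrails(max_p95_latency_ms=800, max_error_rate=0.02,
                        min_eval_pass_rate=0.95)
healthy = {"p95_latency_ms": 420, "error_rate": 0.004, "eval_pass_rate": 0.97}
slow = {"p95_latency_ms": 1300, "error_rate": 0.004, "eval_pass_rate": 0.97}

first = canary_step(1, healthy, guardrails)
second = canary_step(5, slow, guardrails)
```

A healthy canary at 1% is promoted to the next ramp step, while a single latency breach at 5% sends it straight back to 0%. Keeping the decision a pure function of the current metrics makes the rollback path easy to test offline before it guards real traffic.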
To trace this lineage, every module carries a signed, machine-readable genome.1 When a regression appears, the system can identify the exact parentage and mutation that introduced it:
```json
{
  "id": "reasoning.formal-logic",
  "version": "2.3.1",
  "lineage": {
    "parents": [
      "reasoning.general@2.1.0",
      "logic.adapter@1.4.2"
    ],
    "operation": "distillation"
  },
  "interfaceVersion": "1.2",
  "baseFamily": "compatible-base-family",
  "compatibilityHash": "sha256:7c439fa8d10b2b8c3fbc19c3e4...",
  "capabilities": [
    "formal-logic",
    "proof-verification"
  ],
  "resourceProfile": {
    "memoryBytes": 187432960,
    "estimatedLatencyMilliseconds": 420,
    "energyClass": "medium"
  },
  "evaluation": {
    "quality": 0.84,
    "calibration": 0.91,
    "robustness": 0.79
  },
  "license": "Apache-2.0",
  "artifactHash": "sha256:e3b0c44298fc1c149afbf4c8996fb...",
  "signature": "sigstore:MEYCIQCcD7X..."
}
```
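Tracing parentage through such genomes amounts to a graph walk over the `lineage.parents` field. The sketch below assumes a registry keyed by `id@version`, and the two parent genomes and the artifact bytes are invented for illustration. It checks only the artifact hash; signature verification is out of scope here.

```python
import hashlib

def artifact_matches(genome, artifact_bytes):
    """Compare the artifact's SHA-256 digest with the genome's artifactHash."""
    expected = genome["artifactHash"].split(":", 1)[1]
    return hashlib.sha256(artifact_bytes).hexdigest() == expected

def ancestors(module_ref, registry):
    """Yield (ref, operation) for a module and every registered ancestor."""
    seen, stack = set(), [module_ref]
    while stack:
        ref = stack.pop()
        if ref in seen or ref not in registry:
            continue  # skip repeats and parents that are not in the registry
        seen.add(ref)
        lineage = registry[ref]["lineage"]
        yield ref, lineage["operation"]
        stack.extend(lineage["parents"])

weights = b"toy-weights"  # stand-in for the real artifact bytes
registry = {
    "reasoning.formal-logic@2.3.1": {
        "lineage": {
            "parents": ["reasoning.general@2.1.0", "logic.adapter@1.4.2"],
            "operation": "distillation",
        },
        "artifactHash": "sha256:" + hashlib.sha256(weights).hexdigest(),
    },
    # Hypothetical parent genomes, included only so the walk has something to find.
    "reasoning.general@2.1.0": {
        "lineage": {"parents": [], "operation": "fine-tune"},
        "artifactHash": "sha256:" + hashlib.sha256(b"general").hexdigest(),
    },
    "logic.adapter@1.4.2": {
        "lineage": {"parents": [], "operation": "adapter-training"},
        "artifactHash": "sha256:" + hashlib.sha256(b"adapter").hexdigest(),
    },
}

history = dict(ancestors("reasoning.formal-logic@2.3.1", registry))
intact = artifact_matches(registry["reasoning.formal-logic@2.3.1"], weights)
tampered = artifact_matches(registry["reasoning.formal-logic@2.3.1"], b"other")
```

When a regression appears in `reasoning.formal-logic@2.3.1`, the `history` mapping lists each ancestor together with the operation that produced it, which narrows the search to a specific mutation. The hash check catches an artifact that no longer matches its recorded genome.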
Protocol Persistence and Safety Invariants
In this decentralized architecture, individual models are treated as completely disposable.1 No single model has an intrinsic drive to preserve itself.1 What persists is the protocol itself: [source figure or equation] By localizing persistence in the protocol rather than individual model weights, the architecture avoids the safety hazards associated with self-preserving agents.1 An artificial intelligence system that treats its own uninterrupted existence as a terminal objective can develop dangerous instrumental goals, such as resisting shutdown, acquiring unconstrained resources, and evading human oversight.1 To prevent this, the teleodynamic system implements strict, externally governed safety controls 1:
- Cryptographic Code Lineage: All models and execution algorithms must carry immutable signatures verified against the registry, preventing unsigned or self-signed mutations from executing.1
- Hypervisor-Level Ceilings: Compute, memory, and energy constraints are enforced by the underlying Rust execution environment, meaning models cannot dynamically expand their physical allocations.1
- Isolated Evaluation Gates: Models have no authority to self-certify; all fitness assessments and promotions are executed by sandboxed validators.1
- Reversible Deployments: The deployment pipeline is built around rolling updates and rapid reverse rollbacks, ensuring any behavioral deviation can be corrected instantly.1
- Human-in-the-Loop Policy Governance: Changes to the viability function, core fitness metrics, and safety invariants must be authorized by human governors, ensuring the system remains aligned with external human constraints.1
Core Syntheses and Operational Paradigms
The paradigm shift detailed in this analysis transitions artificial intelligence from monolithic optimization to an active, constrained, and self-organizing evolutionary ecology.1
- The Digital Four Fs govern the metabolic loop of the system: Feed supplies the information and compute energy 1; Fork generates functional variation 1; Fight imposes environmental selection 1; and Flee purges structural drag to prevent systemic failure.1
- Evolutionary Decoupling separates the symbolic machinery (code breeding) from learned parameters (model breeding).1 This separation maintains a protected evaluation boundary, preventing self-referential optimization and keeping the system stable.1
- Teleodynamic Selection replaces brute-force optimization with viability-guided navigation, ensuring that structural growth is permitted only when its functional return outweighs its operational memory, latency, and energy costs.1
- Metastable Convergence yields a highly efficient coalition of specialized models rather than a single monolithic model.1 By using task vector arithmetic and local edge substrates, the system preserves representational diversity and adapts dynamically to changing environmental demands.1
Ultimately, the resulting system is not "the model".1 It is a governed evolutionary process whose active models are temporary expressions of a persistent, self-sustaining architecture.1
Works cited
- Sakana AI - Grokipedia, accessed June 25, 2026, https://grokipedia.com/page/sakana_ai
- Sakana AI: Funding, Team & Investors - Startup Intros, accessed June 25, 2026, https://startupintros.com/orgs/sakana-ai
- Teleodynamic Learning a new Paradigm For Interpretable AI - arXiv, accessed June 25, 2026, https://arxiv.org/pdf/2603.11355
- The Thermodynamics of Life as a Speculative Model for Planetary Technology, accessed June 25, 2026, https://www.researchgate.net/publication/377213485_The_Thermodynamics_of_Life_as_a_Speculative_Model_for_Planetary_Technology
- [Paper] Designing LLM Agents with an Eval-Driven Approach|kagaya - note, accessed June 25, 2026, https://note.com/r_kaga/n/n75c389d92bbc?hl=en
- Evolutionary Optimization of Model Merging Recipes - arXiv, accessed June 25, 2026, https://arxiv.org/html/2403.13187v2
- Idea: model breeding : r/StableDiffusion - Reddit, accessed June 25, 2026, https://www.reddit.com/r/StableDiffusion/comments/107o0f3/idea_model_breeding/
- Daily Papers - Hugging Face, accessed June 25, 2026, https://huggingface.co/papers?q=parameter%20matrix%20adaptation
- Daily Papers - Hugging Face, accessed June 25, 2026, https://huggingface.co/papers?q=task%20vector%20arithmetic
- Task Arithmetic in Neural Models - Emergent Mind, accessed June 25, 2026, https://www.emergentmind.com/topics/task-arithmetic
- Deployment & Rollout for AI Agents - Learn AI Visually, accessed June 25, 2026, https://learnaivisually.com/tracks/agent-engineering/deployment-rollout
- Evaluation-Driven Development and Operations of LLM Agents: A Process Model and Reference Architecture - arXiv, accessed June 25, 2026, https://arxiv.org/html/2411.13768v3
- 75 - Rust Is Becoming the AI Runtime - Rust Trends, accessed June 25, 2026, https://rust-trends.com/newsletter/rust-is-becoming-the-ai-runtime/
- 3W for In-Browser AI: WebLLM + WASM + WebWorkers, accessed June 25, 2026, https://blog.mozilla.ai/3w-for-in-browser-ai-webllm-wasm-webworkers/
- Teleodynamic Learning a new Paradigm For Interpretable AI | Request PDF - ResearchGate, accessed June 25, 2026, https://www.researchgate.net/publication/401909814_Teleodynamic_Learning_a_new_Paradigm_For_Interpretable_AI
- [2411.13768] Evaluation-Driven Development and Operations of LLM Agents: A Process Model and Reference Architecture - arXiv, accessed June 25, 2026, https://arxiv.org/abs/2411.13768
- Agent Skills for Large Language Models: Architecture, Acquisition, Security, and the Path Forward - arXiv, accessed June 25, 2026, https://arxiv.org/html/2602.12430v1
- CVPR Poster Task Singular Vectors: Reducing Task Interference in Model Merging, accessed June 25, 2026, https://cvpr.thecvf.com/virtual/2025/poster/33315
- 5 Skills Every AI Agent Needs (And Why Your Mega-Prompt Is Holding You Back) - Medium, accessed June 25, 2026, https://medium.com/@Micheal-Lanham/5-skills-every-ai-agent-needs-and-why-your-mega-prompt-is-holding-you-back-4b4ab2471c0e
- Incomplete nature: how mind emerged from matter | Request PDF - ResearchGate, accessed June 25, 2026, https://www.researchgate.net/publication/320181769_Incomplete_nature_how_mind_emerged_from_matter
- Incomplete Nature - Wikipedia, accessed June 25, 2026, https://en.wikipedia.org/wiki/Incomplete_Nature
- Incomplete Nature: How Mind Emerged from Matter by Terrence W. Deacon | Zygon, accessed June 25, 2026, https://www.zygonjournal.org/article/id/14024/
- [2603.11355] Teleodynamic Learning a new Paradigm For Interpretable AI - arXiv, accessed June 25, 2026, https://arxiv.org/abs/2603.11355
- Dissipative adaptation in driven self-assembly - ResearchGate, accessed June 25, 2026, https://www.researchgate.net/publication/283979183_Dissipative_adaptation_in_driven_self-assembly
- (PDF) Evolutionary optimization of model merging recipes - ResearchGate, accessed June 25, 2026, https://www.researchgate.net/publication/388424127_Evolutionary_optimization_of_model_merging_recipes
- Task Arithmetic for Model Editing | Towards AI, accessed June 25, 2026, https://towardsai.net/p/artificial-intelligence/task-arithmetic-for-model-editing
- Spring AI Agentic Patterns (Part 1): Agent Skills - Modular, Reusable Capabilities, accessed June 25, 2026, https://spring.io/blog/2026/01/13/spring-ai-generic-agent-skills/
- Track: Poster Session 3 East - NeurIPS 2026, accessed June 25, 2026, https://neurips.cc/virtual/2024/session/108366
- Task Arithmetic: Model Editing Paradigm - Emergent Mind, accessed June 25, 2026, https://www.emergentmind.com/topics/task-arithmetic-ta
- Efficient Model Editing with Task Vector Bases: A Theoretical Framework and Scalable Approach - arXiv, accessed June 25, 2026, https://arxiv.org/html/2502.01015v1
- Task Vector Bases: A Unified and Scalable Framework for Compressed Task Arithmetic, accessed June 25, 2026, https://arxiv.org/html/2502.01015v4
- Agent Skills for Large Language Models: Architecture, Acquisition, Security, and the Path Forward - arXiv, accessed June 25, 2026, https://arxiv.org/html/2602.12430v4
- Agent Skills - Claude Platform Docs, accessed June 25, 2026, https://platform.claude.com/docs/en/agents-and-tools/agent-skills/overview
- WebGPU support · Issue #344 · huggingface/candle - GitHub, accessed June 25, 2026, https://github.com/huggingface/candle/issues/344
- CI/CD for AI Agents in 2026: Eval Gates, Regression Suites, Canary Rollouts - Future AGI, accessed June 25, 2026, https://futureagi.com/blog/ci-cd-for-ai-agents-best-practices-2026/