Objective
Triage incoming documents into workflow queues while extracting structured fields, identifying risk, and abstaining when confidence is insufficient. The system evolves by adding specialists for new document families and retiring redundant ones.
Capability packages
- format and language detector;
- document-family classifier;
- field extraction specialists;
- risk and policy classifier;
- confidence calibrator;
- deterministic schema validator;
- human-review queue adapter.
Request plan
FUNCTION triage(document)
    metadata <- DETECT_FORMAT_LANGUAGE_AND_DATA_CLASS(document)
    family <- CLASSIFY_DOCUMENT_FAMILY(document, metadata)
    IF family.confidence < family_threshold
        RETURN HUMAN_REVIEW("unknown family")
    END IF
    extractor <- ROUTE_TO_VERIFIED_EXTRACTOR(family.label, metadata)
    fields <- extractor.EXTRACT(document)
    validation <- VALIDATE_SCHEMA_AND_CROSS_FIELDS(fields)
    risk <- RISK_SPECIALIST_CLASSIFY(document, fields)
    IF NOT validation.pass OR risk.requires_human
        RETURN HUMAN_REVIEW_WITH_EVIDENCE(fields, risk, validation)
    END IF
    RETURN ROUTE_WORKFLOW(fields, risk)
END FUNCTION
Breeding strategy
Cluster human-review cases by failure signature. Create a new specialist only when a stable, sufficiently large niche exists. For small temporary clusters, improve routing, retrieval, or deterministic rules instead.
Candidate operators include adapter training, distillation from a larger extraction model, taxonomy update, confidence recalibration, and quantization for high-volume families.
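The niche rule in this section can be sketched as a small decision function. This is a minimal illustration, not a prescribed implementation: the `ReviewCase` record, the weekly time bucket, and the `min_cases` and `min_weeks` thresholds are all hypothetical stand-ins for whatever "stable" and "sufficiently large" mean in a given deployment.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewCase:
    signature: str  # hypothetical failure signature, e.g. "invoice:missing_total"
    week: int       # hypothetical time bucket in which the case was reviewed


def propose_actions(cases, min_cases=50, min_weeks=4):
    """Map each failure signature to a suggested action.

    A signature qualifies for a new specialist only if it is both large
    (at least min_cases cases) and stable (seen in at least min_weeks
    distinct weeks). Both thresholds are illustrative assumptions.
    """
    weeks_by_sig = defaultdict(set)
    count_by_sig = defaultdict(int)
    for case in cases:
        weeks_by_sig[case.signature].add(case.week)
        count_by_sig[case.signature] += 1

    actions = {}
    for sig, count in count_by_sig.items():
        stable = len(weeks_by_sig[sig]) >= min_weeks
        large = count >= min_cases
        if stable and large:
            actions[sig] = "new_specialist"
        else:
            # Small or short-lived cluster: prefer routing, retrieval, or rule fixes.
            actions[sig] = "fix_routing_or_rules"
    return actions
```

Counting distinct weeks is only one possible proxy for stability; a deployment might instead require the cluster to persist across template versions or senders.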
Evaluation
Track exact-match field accuracy, critical-field recall, family confusion, calibration error, abstention rate, human-review volume, latency, cost, and downstream workflow error. Maintain temporal holdouts, that is, documents dated after the training cutoff, to test new document templates.
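Two of the metrics listed here can be sketched in a few lines. The sketch assumes extracted fields arrive as plain dictionaries and that each prediction carries a confidence in [0, 1]; the function names, the equal-width binning, and the choice of ten bins are assumptions made for illustration.

```python
def critical_field_recall(gold, predicted, critical_fields):
    """Fraction of critical gold fields extracted with exactly the gold value."""
    total = 0
    hits = 0
    for gold_doc, pred_doc in zip(gold, predicted):
        for name in critical_fields:
            if name in gold_doc:
                total += 1
                if pred_doc.get(name) == gold_doc[name]:
                    hits += 1
    return hits / total if total else 0.0


def expected_calibration_error(confidences, correct, n_bins=10):
    """Size-weighted mean gap between confidence and accuracy per bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 lands in the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    n = len(confidences)
    ece = 0.0
    for items in bins:
        if items:
            avg_conf = sum(c for c, _ in items) / len(items)
            accuracy = sum(1 for _, ok in items if ok) / len(items)
            ece += (len(items) / n) * abs(avg_conf - accuracy)
    return ece
```

Reporting both numbers separately for the temporal holdout and the in-distribution test set makes it easier to see whether new templates degrade extraction, calibration, or both.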
Safety and governance
Sensitive documents stay within approved jurisdictions. Models return structured outputs only. Deterministic validators enforce formats and cross-field constraints. High-risk or low-confidence outcomes require human review; the model cannot suppress that requirement.
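A minimal sketch of the last two rules follows, assuming an invoice-like family with hypothetical `iban`, `issue_date`, and `due_date` fields. The regular expression checks only the overall shape of an IBAN, not its checksum, and the `risk_limit` and `min_confidence` values are placeholders. The point of the sketch is that the review decision is computed from validator output and scores, so nothing in the model's structured output can switch it off.

```python
import re
from datetime import date

# Shape check only (country code, two check digits, 11 to 30 alphanumerics).
# A production validator would also verify the checksum.
IBAN_SHAPE = re.compile(r"^[A-Z]{2}\d{2}[A-Z0-9]{11,30}$")


def validate(fields):
    """Return a list of violations; an empty list means the fields pass."""
    errors = []
    if not IBAN_SHAPE.match(str(fields.get("iban", ""))):
        errors.append("iban: bad format")
    try:
        issued = date.fromisoformat(fields["issue_date"])
        due = date.fromisoformat(fields["due_date"])
    except (KeyError, TypeError, ValueError):
        errors.append("dates: missing or not ISO 8601")
    else:
        # Cross-field constraint: a due date cannot precede the issue date.
        if due < issued:
            errors.append("due_date precedes issue_date")
    return errors


def requires_human(errors, risk_score, confidence,
                   risk_limit=0.7, min_confidence=0.9):
    """Decide on review from validator output and scores alone.

    The extracted fields are deliberately not an input, so a model
    cannot emit a flag that suppresses the review requirement.
    """
    return bool(errors) or risk_score >= risk_limit or confidence < min_confidence
```

Keeping `requires_human` free of any model-controlled override flag is the design choice that matters here; the specific checks are interchangeable.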
Population management
Retire specialists when their document family disappears, merge those with indistinguishable behavior, and preserve rollback for taxonomies and extractors. Monitor router starvation and class drift.
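One way to make "indistinguishable behavior" and "router starvation" measurable is sketched below. It assumes two specialists have been run on the same probe set and that the router's recent decisions are available as a list of specialist names; the function names and the `min_share` floor are hypothetical.

```python
from collections import Counter


def agreement_rate(outputs_a, outputs_b):
    """Share of probe documents on which two specialists return equal output."""
    if len(outputs_a) != len(outputs_b):
        raise ValueError("both specialists must be run on the same probe set")
    if not outputs_a:
        return 0.0
    same = sum(1 for a, b in zip(outputs_a, outputs_b) if a == b)
    return same / len(outputs_a)


def starved_specialists(routed_names, registered, min_share=0.01):
    """Registered specialists whose share of routed traffic is below min_share."""
    counts = Counter(routed_names)
    total = len(routed_names)
    if total == 0:
        # No traffic at all: every registered specialist counts as starved.
        return sorted(registered)
    return sorted(name for name in registered if counts[name] / total < min_share)
```

A pair with an agreement rate near 1.0 on a representative probe set is a merge candidate, and a specialist that stays starved over several monitoring windows is a retirement candidate, subject to the rollback requirement stated above.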
Source reports used for this guide
These reports are preserved verbatim in the site archive. The guide above is an editorial synthesis and may narrow, qualify, or reorganize claims from the source material.