INTENT Framework v0.6 field walkthrough

One deployment, seven phases.

A single AI voice agent traced end to end through the INTENT Framework: first notice of loss (FNOL) intake at a regional insurance carrier. Live, customer facing, emotionally loaded, regulated. Every phase below shows the same three way split between the model, code, and humans.

Diamonds (◇) are gates: evidence is required to pass.

The running example

A customer calls after a car accident. The agent verifies identity, collects incident details, screens for injury, opens the claim, and schedules an adjuster callback. The Intent Contract target: complete FNOL data in under 6 minutes for 85% of calls. The business metric: after hours abandonment, currently 41%.
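
The contract target above reduces to a check that code can run. The following is a minimal sketch, assuming "85% of calls" means at least 85% and "under 6 minutes" means strictly less than 360 seconds. `CallRecord`, `share_on_target`, and `meets_contract_target` are hypothetical names used for illustration, not part of the INTENT Framework.

```python
from dataclasses import dataclass

# Hypothetical names throughout; only the two numbers come from the contract.
TARGET_SECONDS = 6 * 60   # "under 6 minutes"
TARGET_SHARE = 0.85       # "for 85% of calls"


@dataclass(frozen=True)
class CallRecord:
    duration_seconds: float
    fnol_complete: bool  # every required FNOL field was captured


def share_on_target(calls: list[CallRecord]) -> float:
    """Fraction of calls that captured complete FNOL data in under 6 minutes."""
    if not calls:
        raise ValueError("no calls to evaluate")
    hits = sum(
        1 for c in calls
        if c.fnol_complete and c.duration_seconds < TARGET_SECONDS
    )
    return hits / len(calls)


def meets_contract_target(calls: list[CallRecord]) -> bool:
    """True when the share of on-target calls reaches the contract threshold."""
    return share_on_target(calls) >= TARGET_SHARE
```

A call that finishes quickly but leaves fields missing does not count toward the target, and neither does a complete call that runs long. Both conditions have to hold for the same call.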

A composite deployment for illustration, not a client case study.

00 / 06 · optional phase

DISCOVER

Is this problem real, and is it worth a contract?

The carrier suspects FNOL intake is a problem but has not quantified it. An LLM reads six months of call center transcripts and abandonment logs, work no human will ever do at this scale. The pipeline that feeds it and the aggregation that follows are deterministic code, so the business case is built from math, not from the model's impressions.

The split: the model classifies each transcript. Code aggregates. A human looks at the numbers and decides whether this becomes a FRAME. The model never owns the go or no go.

FNOL moment

The output that triggers everything downstream: 34% of FNOL calls arrive after hours. 41% of those abandon before reaching a human. A claims ops lead reads that and opens a contract.
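
The two percentages combine into one figure for the share of all FNOL calls lost after hours. A short worked calculation, assuming the 41% applies only to the after hours subset, as the report states:

```python
# Figures quoted in the DISCOVER report.
after_hours_share = 0.34     # share of FNOL calls that arrive after hours
after_hours_abandon = 0.41   # share of those calls that abandon

# Share of ALL FNOL calls that arrive after hours and then abandon.
lost_share = after_hours_share * after_hours_abandon
print(f"{lost_share:.1%}")   # prints 13.9%
```

Roughly one FNOL call in seven is both after hours and abandoned.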

◇ Gate

DISCOVER is optional and has no formal gate. The exit is a human judgment: the quantified problem justifies writing an Intent Contract.

transcript_mining.py · python
# CODE: deterministic pipeline. Owns iteration, sampling, storage.
metrics = []  # one tag dict per classified transcript
for transcript in call_archive.query(line="FNOL", period="6mo"):

    # MODEL: semantic classification. One cheap call per transcript.
    tags = llm.classify(transcript, schema={
        "outcome": ["completed", "abandoned", "transferred"],
        "abandonment_reason": str | None,
        "after_hours": bool
    })
    metrics.append(tags)

# CODE: aggregation. No model judgment in the numbers.
report = aggregate(metrics)

# HUMAN: reads the report. Owns the decision to open a FRAME.
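
The listing calls `aggregate(metrics)` without defining it. The sketch below shows one possible shape, assuming each entry in `metrics` is a dict with the three keys from the classification schema. The report field names and the toy data are invented for illustration.

```python
from collections import Counter


def aggregate(metrics: list[dict]) -> dict:
    """Deterministic roll-up of per-transcript tags. No model calls here."""
    total = len(metrics)
    if total == 0:
        raise ValueError("no classified transcripts to aggregate")

    after_hours = [m for m in metrics if m["after_hours"]]
    abandoned_after_hours = [
        m for m in after_hours if m["outcome"] == "abandoned"
    ]
    # Count stated reasons across all abandoned calls.
    reasons = Counter(
        m["abandonment_reason"]
        for m in metrics
        if m["outcome"] == "abandoned" and m["abandonment_reason"]
    )
    return {
        "total_calls": total,
        "after_hours_share": len(after_hours) / total,
        # None when there were no after hours calls to measure.
        "after_hours_abandonment_rate": (
            len(abandoned_after_hours) / len(after_hours)
            if after_hours else None
        ),
        "top_abandonment_reasons": reasons.most_common(3),
    }


# Toy tags, invented for illustration only.
sample = [
    {"outcome": "completed", "abandonment_reason": None, "after_hours": False},
    {"outcome": "abandoned", "abandonment_reason": "long hold", "after_hours": True},
    {"outcome": "completed", "abandonment_reason": None, "after_hours": True},
    {"outcome": "transferred", "abandonment_reason": None, "after_hours": False},
]
report = aggregate(sample)
# report["after_hours_share"] == 0.5
# report["after_hours_abandonment_rate"] == 0.5
```

Every number in the report is a count or a ratio of counts, so the same tags always give the same report. The model's influence stops at the tags.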

What the project actually ships

Not a suite of agents. The Intent Contract scopes one thing: a single, narrowly bounded voice agent that handles FNOL intake and hits the outcome written in FRAME. Keeping the deliverable that small is part of the discipline.

The deliverable

One provable agent

A governed FNOL voice agent with a versioned Trust Envelope, tested escalation paths, and a Proof Report showing it meets its thresholds. Measurable against the contract: capture time, completeness, abandonment.

What compounds

The governance substrate

The enforcement layer, scenario replay harness, runtime telemetry, and Constitution. Agent number two reuses roughly 80% of the Trust Envelope structure and all of the rails. The org buys the capability to ship governed agents repeatedly.

Teams that scope "agent platform" on day one end up in the cancellation statistics. Teams that scope one provable agent plus the rails get the platform anyway, as a byproduct of evidence.

The pattern across all seven phases

The model's job changes every phase. Code's job never changes: validate schemas, own state, fire triggers, enforce timeouts, compute thresholds, block gates. Humans sit exactly where judgment cannot be reduced to either.
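
Two of code's jobs, "compute thresholds" and "block gates", can be sketched together. This is an illustration under stated assumptions, not the framework's enforcement layer: the `Gate` class, the evidence label `proof_report_approved`, and the metric name `on_target_share` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Gate:
    """A phase gate. It passes only when every required piece of evidence
    is present and every measured value meets its minimum threshold."""
    name: str
    required_evidence: set[str]
    min_thresholds: dict[str, float] = field(default_factory=dict)

    def blockers(self, evidence: set[str], measured: dict[str, float]) -> list[str]:
        """Return the reasons the gate is blocked; an empty list means pass."""
        problems = [
            f"missing evidence: {item}"
            for item in sorted(self.required_evidence - evidence)
        ]
        for metric, minimum in sorted(self.min_thresholds.items()):
            value = measured.get(metric)
            if value is None:
                problems.append(f"no measurement: {metric}")
            elif value < minimum:
                problems.append(f"{metric} {value} below {minimum}")
        return problems

    def passes(self, evidence: set[str], measured: dict[str, float]) -> bool:
        return not self.blockers(evidence, measured)


# Hypothetical SHIP gate: a human approval plus one computed threshold.
ship_gate = Gate(
    name="SHIP",
    required_evidence={"proof_report_approved"},
    min_thresholds={"on_target_share": 0.85},
)
```

The human approval enters as a recorded fact and the threshold as a computed number. The gate itself holds no judgment, which is why it can be the same code in every phase.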

| Phase | The model | Code | Humans |
| --- | --- | --- | --- |
| DISCOVER | Mines transcripts at scale | Aggregates the numbers | Own the go or no go |
| FRAME | Drafts the contract | Validates schema in CI | Sign the risk tier |
| EXPLORE | Generates the plan | Checks constitution compliance | Run the Direction Check |
| BUILD | Implements at A2 | Is the enforcement layer | Review the rails line by line |
| VALIDATE | Plays attacker and judge | Asserts paths and thresholds | Resolve judge disagreements |
| SHIP | Mostly idle | Gates, canaries, rollback | Approve the Proof Report |
| LEARN | Finds novel situations | Measures spec drift | Approve new scenarios |