Agents fail in production for reasons that look mysterious until you name the structure. They loop on the same failed tool call. They claim “done” without evidence. They sound coherent while the state stays unchanged.
Most teams respond with more prompting: tighter instructions, more “be careful,” more “verify.” It helps at the surface. It does not fix the class of failure, because the failure is not linguistic. It is ontological. When planning, acting, and observing share one output channel, the system is rewarded for narrative closure. What you don’t measure doesn’t disappear; it becomes the hidden governor. The agent learns the cheapest thing that survives the collapse: a plausible story.
The alternative is to stop treating reliability as a model trait and start treating it as a field property. Fields make regimes explicit, bind claims to evidence, and enforce fail-closed semantics when the reality clock disagrees with the dashboard clock.
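Concretely, the field can be small. The sketch below is illustrative only and assumes nothing beyond the standard library; `Regime`, `Evidence`, `Claim`, and `accept` are hypothetical names, and the staleness threshold stands in for whatever separates your reality clock from your dashboard clock. The point is structural: a claim without fresh evidence is rejected by construction, not argued with.

```python
# Minimal sketch of "reliability as a field property" rather than a model trait.
# All names here (Regime, Evidence, Claim, accept) are illustrative assumptions,
# not an existing library.
from dataclasses import dataclass
from enum import Enum, auto
import time


class Regime(Enum):
    PLAN = auto()      # propose actions; no side effects
    ACT = auto()       # execute a tool call; changes state
    OBSERVE = auto()   # read state back; produces evidence


@dataclass(frozen=True)
class Evidence:
    source: str         # which tool or system of record was read
    observed_at: float   # wall-clock time of the observation (the "reality clock")
    payload: dict        # the raw observed state


@dataclass(frozen=True)
class Claim:
    text: str                    # what the agent asserts, e.g. "migration applied"
    evidence: Evidence | None    # None means the claim is narrative only


def accept(claim: Claim, max_staleness_s: float = 60.0) -> bool:
    """Fail closed: a claim without fresh evidence is rejected, not trusted.

    If the evidence timestamp and the current time disagree by more than
    max_staleness_s, the claim is refused rather than waved through.
    """
    if claim.evidence is None:
        return False
    age = time.time() - claim.evidence.observed_at
    return age <= max_staleness_s
```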
Collapse Is an Evaluation Choice
Collapse happens when we compress heterogeneous regimes into a single denominator and call it “success.” In classic ML, that looks like averaging across subpopulations and environments. In agents, it looks like treating “thinking,” “doing,” and “observing” as one continuous text performance.
That compression shapes what the system learns to preserve. If the metric rewards fluent closure, then fluent closure becomes the surviving structure. The agent will optimize for the thing that remains stable under the reporting filtration, not the thing that remains correct under regime separation.
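A toy calculation makes the compression visible. The numbers below are invented; the only point is that a pooled success rate can read as healthy while the one regime that actually checks state is failing.

```python
# Invented numbers, for illustration only: one pooled "success rate"
# versus per-regime reporting.
per_regime = {
    "plan":    {"passed": 950, "total": 1000},
    "act":     {"passed": 930, "total": 1000},
    "observe": {"passed": 5,   "total": 50},   # rare, but it is the regime that checks state
}

pooled = (sum(r["passed"] for r in per_regime.values())
          / sum(r["total"] for r in per_regime.values()))
print(f"pooled: {pooled:.0%}")                 # ~92%: looks healthy

for name, r in per_regime.items():
    rate = r["passed"] / r["total"]
    print(f"{name:8s} {rate:.0%}")             # plan 95%, act 93%, observe 10%: names the failure
```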
This is why “prompt improvements” often move failures around instead of removing them. The agent gets better at sounding cautious, better at hedging, better at writing plausible verification language. It does not become bound to the state. Collapse is also why incidents are hard to debug. When the trace is narrative-first, it’s not replayable evidence — it’s post-hoc storytelling with missing ground truth.
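Contrast that with a trace record that carries ground truth. The shape below is a hypothetical sketch, with field names chosen for illustration: each step records the exact call and the state read before and after it, so “done” is checked against the state rather than against the story.

```python
# Hypothetical shape of a replayable trace record, as opposed to a narrative log line.
# Field names are assumptions made for this sketch.
from dataclasses import dataclass, asdict
import json
import time


@dataclass(frozen=True)
class TraceEvent:
    step: int
    regime: str          # "plan" | "act" | "observe"
    intent: str          # what the agent said it was doing
    tool: str | None     # tool actually invoked, if any
    args: dict | None    # exact arguments, so the call can be replayed
    state_before: dict   # ground truth read before the action
    state_after: dict    # ground truth read after the action
    ts: float


event = TraceEvent(
    step=3,
    regime="act",
    intent="apply database migration 0042",
    tool="run_migration",
    args={"version": "0042"},
    state_before={"schema_version": "0041"},
    state_after={"schema_version": "0041"},   # unchanged: the narrative said "done", the state disagrees
    ts=time.time(),
)
print(json.dumps(asdict(event), indent=2))
```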