We spent two years building brains in jars

For most of the current AI cycle, the system around the model has been thin. Models could reason, propose commands, and orchestrate small tool calls. But they ran in short sessions, against narrow APIs, under human supervision, with ephemeral state. The model was a brain; the body was a few HTTP requests and a JSON tool schema.

That arrangement is ending. The frontier is not just better reasoning. It is a body for the brain.

The brain finally has a body. Now it needs governance.

The runtime layer for AI agents is arriving

Google Managed Agents, along with parallel moves across the ecosystem (OpenAI’s containerized execution work, Claude Code’s persistent sessions, MCP-based tool ecosystems, and hosted agent harnesses), formalizes the runtime as a product:

  • Sandboxed execution
  • Persistent state across sessions
  • Orchestration loops
  • Infrastructure-native agents
  • Agent-as-a-service lifecycle
  • Long-running sessions
  • Mid-session tool injection
  • Managed runtime lifecycle

This resembles the transition from scripts → applications → cloud platforms. Agents are no longer just calling tools. They are beginning to inhabit programmable environments.

Why persistent agent systems change governance

Once agents can continuously modify filesystems, maintain state across sessions, autonomously remediate, inject tools dynamically, operate against production systems, and coordinate across workflows, governance failures stop being one-off review misses. They compound over time.

What that compounding looks like:

  • Architectural drift — small deviations accumulate across long-running sessions
  • Policy propagation failures — constraints applied in one tool not enforced in the next
  • Runtime state divergence — the world the agent believes it’s acting in stops matching production
  • Autonomous violation loops — a remediation that itself violates an invariant runs again on the next tick
  • Inconsistent remediation behavior — same condition, different fix, no audit of why
  • Invisible constraint decay — rules that no longer hold in practice but are never re-checked
  • Provenance loss across execution chains — nobody can reconstruct why the system did what it did

Architectural governance becomes an execution-time systems concern, not a review-time coding concern.
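To make "runtime state divergence" from the list above concrete, here is a minimal sketch of a pre-action check that compares the agent's cached view of the world with what the environment actually reports. The function name `find_divergence`, the `Divergence` record, and both snapshots are hypothetical, chosen only for illustration; a real runtime would read the observed state from its own APIs.

```python
from dataclasses import dataclass

_MISSING = "<missing>"  # sentinel for keys present on only one side


@dataclass(frozen=True)
class Divergence:
    key: str
    believed: object
    observed: object


def find_divergence(believed: dict, observed: dict) -> list[Divergence]:
    """Return every key where the agent's cached view disagrees with reality."""
    drift = []
    for key in sorted(believed.keys() | observed.keys()):
        b = believed.get(key, _MISSING)
        o = observed.get(key, _MISSING)
        if b != o:
            drift.append(Divergence(key, b, o))
    return drift


# Hypothetical snapshot the agent cached at the start of a long session.
believed = {"replicas": 3, "image": "api:1.4.2", "feature_flag": "off"}
# Hypothetical state read back from the environment just before acting.
observed = {
    "replicas": 5,
    "image": "api:1.4.2",
    "feature_flag": "off",
    "maintenance": True,
}

drift = find_divergence(believed, observed)
if drift:
    # Refuse to act on a stale picture; re-sync first.
    print("stale view, re-sync before acting:", [d.key for d in drift])
```

Running a check like this before each mutating action, rather than once per session, is what keeps a small mismatch from compounding across hours of execution.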

Execution environments expand the governance surface

The surface that needs governance is no longer "a diff before merge." It is everything an agent can touch while it runs:

  • Filesystem mutations
  • Terminal execution
  • Deployment actions
  • Runtime state
  • Orchestration loops
  • Remediation chains
  • Branch and PR generation
  • Operational metadata
  • Tool injection
  • Infrastructure APIs

Every one of those is an execution surface that can carry, or fail to carry, architectural intent. The point of governance propagation is that the same compiled constraints reach all of them — or the layer is not doing its job.
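One way to picture governance propagation is a single rule set that every tool wrapper consults before it runs. The sketch below is illustrative only: `Action`, `governed`, `no_prod_secrets`, and the two stand-in tools are assumed names, not part of any particular agent runtime.

```python
from dataclasses import dataclass
from functools import wraps
from typing import Callable, Optional


@dataclass(frozen=True)
class Action:
    surface: str  # which execution surface, e.g. "filesystem" or "terminal"
    target: str   # the path, command, or resource being acted on


class PolicyViolation(Exception):
    """Raised before the tool runs, so the action never happens."""


def no_prod_secrets(action: Action) -> Optional[str]:
    # Hypothetical rule: nothing may touch the production secrets directory.
    if "secrets/prod" in action.target:
        return "touches production secrets"
    return None


# The single compiled constraint set shared by every surface.
RULES: list[Callable[[Action], Optional[str]]] = [no_prod_secrets]


def governed(surface: str):
    """Wrap a tool so each call is checked against RULES first."""
    def decorate(tool):
        @wraps(tool)
        def guarded(target: str):
            action = Action(surface, target)
            for rule in RULES:
                reason = rule(action)
                if reason is not None:
                    raise PolicyViolation(f"{surface}: {reason}")
            return tool(target)
        return guarded
    return decorate


@governed("filesystem")
def write_file(path: str) -> str:
    return f"wrote {path}"  # stand-in for a real write


@governed("terminal")
def run_command(command: str) -> str:
    return f"ran {command}"  # stand-in for a real subprocess call


print(write_file("src/app.py"))
try:
    run_command("cat secrets/prod/key.pem")
except PolicyViolation as err:
    print("blocked:", err)
```

Because both tools share `RULES`, adding a constraint once makes it apply on every wrapped surface, which is the property the paragraph above asks of a governance layer.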

Why PR review governance stops scaling

Traditional governance assumes a human reviews generated artifacts after execution. That worked when generation was human-paced.

Long-running agents generate continuously:

  • Branches
  • Commits
  • Remediation loops
  • Infrastructure changes
  • Deployment actions
  • Operational metadata
  • Runtime state mutations

Pushing all of that into PR review turns the review queue into downstream damage control. The agent has already acted. Whatever drifted has already drifted. Review can document it; review cannot prevent it.

Persistent agent runtimes break review-based governance models.

The implication is that governance has to move where the execution is — before generation, during the run, and at every tool boundary the runtime exposes.

Runtime governance and architectural invariants

The right primitive for this is the invariant: a constraint that must hold continuously across the agent’s execution, not just be true at one merge point.

Examples of runtime invariants:

  • Forbidden dependencies never enter the workspace, even mid-session
  • Deployment restrictions apply to every action the agent takes against production
  • Architectural boundaries hold across files the agent visits hours apart
  • Data access policies are enforced for every query, not just at code review
  • Remediation constraints prevent the agent from "fixing" a problem by violating another rule
  • Execution scopes bound what the agent is even allowed to attempt

These are the runtime equivalent of an ADR (architecture decision record): a rule the system enforces, not a paragraph the human remembers. They compose with verification contracts: predefined checks that prove the invariant held across the run.
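A rough sketch of how a runtime invariant and a verification contract might fit together, under assumed names throughout: `Invariant`, `GovernedRun`, `contract_holds`, and the forbidden package `left-pad-fork` are all hypothetical. Each step is applied to a copy of the workspace and committed only if every invariant still holds; the contract then re-checks the recorded states independently of the decisions made during the run.

```python
import copy
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class Invariant:
    name: str
    holds: Callable[[dict], bool]


FORBIDDEN = {"left-pad-fork"}  # hypothetical forbidden dependency

no_forbidden_deps = Invariant(
    "no-forbidden-dependencies",
    lambda ws: not (ws["dependencies"] & FORBIDDEN),
)


@dataclass
class GovernedRun:
    invariants: list[Invariant]
    trace: list[dict] = field(default_factory=list)

    def step(self, label: str, workspace: dict, mutate) -> dict:
        """Try a mutation on a copy; keep it only if all invariants hold."""
        candidate = mutate(copy.deepcopy(workspace))
        rejected = [i.name for i in self.invariants if not i.holds(candidate)]
        result = workspace if rejected else candidate
        # Record the state that is actually in effect after this step.
        self.trace.append(
            {"step": label, "rejected": rejected, "state": copy.deepcopy(result)}
        )
        return result


def contract_holds(invariants: list[Invariant], trace: list[dict]) -> bool:
    """Verification contract: every recorded state satisfies every invariant."""
    return all(inv.holds(e["state"]) for e in trace for inv in invariants)


def add_dep(name: str):
    # Helper that builds a mutation adding one dependency.
    return lambda ws: {**ws, "dependencies": ws["dependencies"] | {name}}


run = GovernedRun([no_forbidden_deps])
workspace = {"dependencies": {"requests"}}
workspace = run.step("add attrs", workspace, add_dep("attrs"))
workspace = run.step("add forbidden", workspace, add_dep("left-pad-fork"))

print(sorted(workspace["dependencies"]))
print(contract_holds(run.invariants, run.trace))
```

Keeping the contract separate from the step logic is a deliberate choice in this sketch: the check that the invariant held does not have to trust the code that was supposed to enforce it.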

The emerging AI infrastructure stack

The shape that is starting to settle:

Layer               | Job
Model layer         | Reasoning and generation
Runtime layer       | Execution environments, orchestration, persistence
Tool layer          | APIs, MCP, integrations, external systems
Governance layer    | Architectural invariants, provenance, policy propagation
Verification layer  | Runtime validation, enforcement traces, constraint evaluation

The governance and verification layers used to sit downstream of model and runtime, applied at the PR or the deploy. In a persistent-agent world, they have to sit inside the loop — reachable from every tool call, every orchestration step, every remediation tick.

Execution environments need verification layers

Persistent agents introduce continuity, memory, authority, and compounding execution. Those properties are the source of the capability gains. They are also the source of the new failure mode.

Continuity without invariants creates drift. Memory without provenance creates plausible but ungrounded decisions. Authority without verification creates silent state divergence. Compounding execution without enforcement traces creates incidents nobody can reconstruct.
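An enforcement trace is only useful for reconstruction if it cannot be quietly rewritten afterwards. One common technique, shown here as a minimal sketch rather than a description of any specific product, is a hash chain: each entry's hash covers the previous entry's hash plus its own record. The record fields and the rule name `no-friday-deploys` are made up for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, record: dict) -> str:
    # Canonical JSON (sorted keys) so the same record always hashes the same.
    payload = json.dumps({"prev": prev_hash, "record": record}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(trace: list, record: dict) -> None:
    """Append a record, linking it to the hash of the previous entry."""
    prev = trace[-1]["hash"] if trace else GENESIS
    trace.append({"record": record, "prev": prev, "hash": _digest(prev, record)})


def verify(trace: list) -> bool:
    """Recompute the chain; any edited or reordered entry breaks it."""
    prev = GENESIS
    for entry in trace:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != _digest(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True


trace: list = []
append(trace, {"tool": "deploy", "decision": "deny", "rule": "no-friday-deploys"})
append(trace, {"tool": "write_file", "decision": "allow", "rule": None})
print("intact:", verify(trace))

# Simulate someone rewriting history after the fact.
trace[0]["record"]["decision"] = "allow"
print("after tampering:", verify(trace))
```

A chain like this does not stop a bad action; it only makes the record of what was allowed and denied tamper-evident, which is the provenance half of the problem described above.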

Persistent agent runtimes transform governance from a review-time concern into a runtime systems problem.

Conclusion: the next AI infrastructure battle

The industry is well on its way to solving how agents execute. The next problem is ensuring they continue executing within architectural intent over time.

The first generation of AI systems optimized reasoning. The next generation is optimizing execution. The generation after that will optimize governance across persistent execution environments — runtime governance, runtime invariants, deterministic enforcement, and provenance that survives across long-running agent workflows.

The next AI infrastructure layer is not more reasoning. It is invariant preservation across execution surfaces. For the conceptual definition, see runtime governance.