The News Is Bigger Than the Project

On June 13, Databricks open-sourced Omnigent, a meta-harness that sits above individual coding harnesses like Claude Code and Codex. Built by Matei Zaharia, Kasey Uhlenhuth, and Corey Zumar and released under Apache 2.0, it adds agent composition, runtime contextual policies, sandboxing, and spend and permission controls. One example: a policy can require a human to approve a git push after an agent has downloaded a new package. It is a serious piece of infrastructure.

But the project itself is not the story. The story is what its existence signals: AI engineering is moving up a layer. For two years the work was about making a single agent more capable. The frontier is now about coordinating many agents safely. Omnigent is one of the first widely shipped examples of that move, which makes it a clean lens on a question the whole industry is about to face: not whether an agent can act, but whether it should.

Phase 1: The Era of the Individual Agent

The first phase was the capable singleton. Claude Code, Codex, Cursor, and Gemini CLI all pursued the same goal: make one agent better at reading a codebase, planning a change, and writing code that compiles and passes tests. The improvements were real and they compounded quickly.

Capability, though, was never the same thing as alignment. A more powerful agent writes more code, faster — including code that quietly violates architectural intent. It can introduce a new dependency the team rejected two quarters ago, reach around a service boundary, or re-implement a pattern an ADR explicitly forbids. None of those are bugs the model can catch, because from the model’s point of view the change is locally correct. A smarter singleton does not close that gap. It widens it.

Phase 2: Agent Orchestration

The second phase answered a different question: how do multiple agents work together? LangGraph, the OpenAI Agents SDK, CrewAI, and AutoGen are orchestration frameworks. They decide which agent takes which task, how context flows between them, how state is shared, and how a multi-step workflow is managed when one agent hands off to the next.

This was a genuine advance, and it is where most teams building agentic systems live today. But orchestration is about execution mechanics. It coordinates how agents run. It is silent on what they are permitted to produce. A workflow can be perfectly orchestrated and still ship a change that breaks the system’s intended design, because nothing in the framework evaluates the output against the rules the organization actually cares about. The same gap from Phase 1 survives the jump to Phase 2 — it just now operates across several agents at once. We have written before about how coordinating multiple agents without an enforcement layer parallelizes inconsistency as fast as it parallelizes output.

Phase 3: The Meta-Harness Layer

Omnigent marks a third phase: a layer above the individual harnesses. A meta-harness composes agents, lets them collaborate through shared interfaces, and governs them at runtime with contextual permissions, sandboxing, and spend controls. The unit of attention moves from the individual agent to the agent ecosystem — a fleet of agents operating across a system, with shared infrastructure for running, isolating, and budgeting them.

This is the right direction, and it is where a real new category is forming. The emergence of meta-harnesses tracks the broader shape of the agent infrastructure stack taking form beneath the models. Runtime policy is part of governance. Requiring human approval before a sensitive action, sandboxing what an agent can touch, capping spend — these are real controls, and they matter.

They are also, almost entirely, controls about access and resources. They answer whether an agent can perform an action: can it reach the network, can it push, can it spend. They do not answer whether the action should happen at all, given the architecture and standards the engineering organization has committed to.
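To make the "can it?" category concrete, here is a minimal sketch of a contextual runtime policy, using the push-after-download example from the announcement. Every name in it (Session, check_action, the spend cap value) is a hypothetical stand-in chosen for illustration. It is not Omnigent's API or data model.

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    REQUIRE_APPROVAL = "require_approval"
    DENY = "deny"


@dataclass
class Session:
    """Hypothetical per-agent runtime state, not any real harness's model."""
    spend_usd: float = 0.0
    spend_cap_usd: float = 5.0  # illustrative cap, chosen arbitrarily
    installed_packages: set = field(default_factory=set)


def record_install(session: Session, package: str) -> None:
    """Remember that the agent downloaded a package during this session."""
    session.installed_packages.add(package)


def check_action(session: Session, action: str) -> Verdict:
    """Decide whether the agent can perform `action` given session context."""
    # Resource control: nothing runs once the spend cap is reached.
    if session.spend_usd >= session.spend_cap_usd:
        return Verdict.DENY
    # Contextual control: a push after a new download needs a human.
    if action == "git push" and session.installed_packages:
        return Verdict.REQUIRE_APPROVAL
    return Verdict.ALLOW
```

Note what the function inspects: the session's history and budget. It never looks at the content of the change being pushed, which is the gap the next section describes.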

The Missing Layer: Governance of Intent

The hard question in a multi-agent system is not “can this agent perform this action?” It is “should it?” Those are different questions, and only the first one is well served today.

  • Should this agent be allowed to modify billing code at all?
  • Should it introduce a new database dependency, or is the persistence layer fixed by decision?
  • Should it be permitted to bypass an ADR that the team ratified deliberately?
  • Should this change be merged when, in aggregate, it nudges the system toward architectural drift?

An orchestration framework will not answer these. A meta-harness with runtime permissions will not either, because the question is not about capability or resource limits — it is about intent. The decision that billing must route through one service, or that no new message broker enters the stack, is not a permission flag. It is an architectural rule, and it has to be checked against the actual proposed change, deterministically, before that change lands.
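One way to picture such a check is the sketch below, which evaluates a proposed change against two architectural rules. The rule identifiers, agent names, paths, and the Rule and ProposedChange types are all assumptions made up for this example. A real system would derive the change from an actual diff and the rules from ratified ADRs.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class ProposedChange:
    agent: str
    touched_paths: tuple
    added_dependencies: tuple


@dataclass(frozen=True)
class Rule:
    rule_id: str
    description: str
    protected_glob: str = ""           # paths only allowed_agents may modify
    allowed_agents: tuple = ()
    forbidden_dependencies: tuple = ()


# Hypothetical rules standing in for decisions a team has ratified.
RULES = [
    Rule("ADR-billing", "Only the billing agent modifies billing code",
         protected_glob="services/billing/*",
         allowed_agents=("billing-agent",)),
    Rule("ADR-broker", "No new message broker enters the stack",
         forbidden_dependencies=("kafka-python", "pika")),
]


def evaluate(change: ProposedChange, rules: list) -> list:
    """Return violation messages; an empty list means the change may land."""
    violations = []
    for rule in rules:
        # Boundary rule: protected paths are closed to unlisted agents.
        if rule.protected_glob and change.agent not in rule.allowed_agents:
            for path in change.touched_paths:
                if fnmatchcase(path, rule.protected_glob):
                    violations.append(f"{rule.rule_id}: {path} is protected")
        # Dependency rule: some packages are excluded by decision.
        for dep in change.added_dependencies:
            if dep in rule.forbidden_dependencies:
                violations.append(f"{rule.rule_id}: dependency {dep} is forbidden")
    return violations
```

The check is a pure function of the change and the rules, so the same inputs always produce the same verdict. That is the sense of "deterministically" here: no model is asked for an opinion at enforcement time.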

Access control asks “can it?” Governance asks “should it?” A meta-harness can stop an agent from pushing without approval. It cannot, on its own, stop an approved push from violating the architecture the team agreed to hold.

Why Agent Governance Becomes Inevitable

This is not a niche concern that a few regulated teams will eventually need. Four forces make it structural, and each one is already in motion.

More autonomy. Agents are taking on longer, more complex tasks with less human review per step. Every increment of autonomy is an increment of unsupervised decision-making, and decisions are exactly where intent gets honored or broken.

More surface area. Multiple agents now operate across multiple repositories at once. A change one agent makes in one service interacts with assumptions another agent is relying on elsewhere. The interactions are where consistency quietly fails.

More architectural risk. Small, locally valid changes accumulate. No single commit looks wrong; the system as a whole slides away from its intended shape. This is the mechanism behind governance propagation: a single enforced decision has to reach every agent call, or drift compounds faster than any reviewer can track.

More accountability. Organizations increasingly need evidence: why did this action occur, who or what approved it, and which policy permitted it. As agents make more of the changes, “the model wrote it” stops being an acceptable answer. The trail of enforcement provenance — which rule allowed or blocked which change — becomes part of the engineering record.
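As a rough illustration of what one entry in that record might hold, the sketch below serializes an enforcement decision as a single JSON log line. The field names and the EnforcementRecord type are assumptions for this example, not an established schema.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class EnforcementRecord:
    change_id: str
    agent: str
    rule_id: str
    decision: str     # "allowed" or "blocked"
    approved_by: str  # human or system identity; "" if no approval was needed
    recorded_at: str  # ISO 8601 timestamp in UTC


def make_record(change_id: str, agent: str, rule_id: str,
                decision: str, approved_by: str = "") -> EnforcementRecord:
    """Build one provenance entry, rejecting unknown decision values."""
    if decision not in ("allowed", "blocked"):
        raise ValueError(f"unknown decision: {decision!r}")
    return EnforcementRecord(
        change_id=change_id,
        agent=agent,
        rule_id=rule_id,
        decision=decision,
        approved_by=approved_by,
        recorded_at=datetime.now(timezone.utc).isoformat(),
    )


def to_log_line(record: EnforcementRecord) -> str:
    """Serialize with sorted keys so log lines are stable and diffable."""
    return json.dumps(asdict(record), sort_keys=True)
```

Each line answers the three questions above: what happened (change_id, decision), who or what approved it (approved_by), and which policy applied (rule_id).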

From Security Policy to Engineering Policy

Today’s agent governance is mostly security governance: permissions, sandboxing, spend limits, approval gates. That layer is necessary and Omnigent does it well. But it is the first half of the problem, not the whole of it.

The governance that engineering organizations actually need extends past security into the substance of the software: architecture, engineering standards, ADR enforcement, dependency controls, and system boundaries. That is where the category has to evolve. A spend cap protects the budget. It does nothing for the service boundary. A sandbox limits the blast radius of a rogue process. It says nothing about whether a sanctioned change respects the data model. The two markets — runtime security governance and architectural governance — are related but not the same, and the second one is barely served.

This is also why the missing layer is not a feature to bolt onto a harness but a distinct concern: a control plane for agent behavior. If you are mapping where this lands in your own stack, the use cases for architectural governance make the boundary concrete — it is the system that evaluates proposed changes against the rules the organization has actually decided to hold.

The Emerging Stack

Lay the layers out in order and the shape of the missing piece becomes obvious. Each layer made agents more useful; each made the next one necessary.

Layer                      Coordinates   Answers
Models                     Generation    Can it produce text and code?
Agents                     Tasks         Can it complete a task end to end?
Agent frameworks           Execution     How do agents run together?
Meta-harnesses             Agents        Can an agent perform this action?
Governance control plane   Decisions     Should this change be allowed?

Agent frameworks coordinate execution. Meta-harnesses coordinate teams of agents. A governance control plane coordinates decisions — it is the layer that holds architectural intent and checks every proposed change against it. The first four layers all increase an agent’s capacity to act on a system. The fifth is what keeps that capacity aligned with what the organization decided the system should be.

Controlling Behavior, Not Just Generating Code

The two-year race to make agents more capable produced extraordinary tools. The next race is different. As AI systems become more autonomous and more numerous, the constraint on engineering quality stops being how much code you can generate and becomes how reliably you can control what that code is allowed to do.

Omnigent is a strong signal that the industry has noticed the shift — that the unit of concern is moving from the agent to the agent ecosystem, and from capability to control. The layer it points toward, but does not by itself complete, is a governance control plane: a system that turns architectural decisions into rules agents must satisfy, enforced deterministically, with a record of which rule allowed or blocked which change. Frameworks coordinate how agents run. Governance coordinates what they are allowed to ship. That second layer is where the next durable advantage in AI engineering will be built.