What Palantir’s White Paper Argues

Governing AI Agents, published by Palantir in 2026, makes a clean argument: “Mechanisms at the model level to ensure that AI Agents meet safety standards are insufficient.” Lab safety testing, Palantir writes, is not a substitute for downstream evaluation in production, because in-built model safeguards fall away once domain-specific context is introduced. Governance therefore has to move to the operational layer where agents actually act.

The paper organizes that layer into two halves. Governance controls cover authorization workflows, bounded execution, testing and evaluation, observability, and fail-safe modes. Governance workflows cover downstream evaluation, human and agentic collaboration, and AI lifecycle development. The spirit is captured in one line: “Just because Agents are powerful does not mean they should by default have access to all data or be able to take any action in any system.” And on reliability, Palantir is blunt about why the stakes differ by use case: an agent that fails 1% of the time may be fine for sales emails, but automatically shipping code from such an agent to production would be an unacceptable risk.
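One way to see why the same 1% matters more for code is to look at how it compounds. The sketch below is illustrative arithmetic, not a figure from the paper: it assumes each task fails independently at the same rate, and the function name is hypothetical.

```python
def p_at_least_one_failure(per_task_failure_rate: float, tasks: int) -> float:
    """Probability that at least one of `tasks` independent tasks fails.

    Assumption: failures are independent and equally likely per task.
    """
    return 1.0 - (1.0 - per_task_failure_rate) ** tasks


# How a 1% per-task failure rate accumulates over auto-shipped changes.
for n in (1, 10, 100):
    print(n, round(p_at_least_one_failure(0.01, n), 3))
```

Under that assumption, 100 automatically shipped changes carry roughly a 63% chance that at least one is faulty. A bad sales email is usually an isolated miss; a bad change stays in the codebase.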

This is operational agent governance, in the same lineage as the enterprise platforms from IBM, ServiceNow, and others. It is a strong articulation of a real category. It is also not the whole problem for software teams.

What Palantir Names but Doesn’t Solve

The most telling sentence in the paper, for an engineering audience, is this: “Take AI Coding Agents. They can easily produce code which is effective and efficient, but the code is very difficult for human engineers to understand.” Palantir identifies the problem precisely. Then its framework stops where every operational-governance framework stops: authorization, bounded execution, evaluation, observability, fail-safes.

None of those controls answers the question a software organization actually has about a coding agent’s output. Did the change violate an architecture decision record? Did it cross a service boundary? Did it introduce a prohibited dependency? Did it contradict a decision the team made eighteen months ago? An agent can be perfectly authorized, fully observed, and bounded to exactly the systems it should touch, and still write code that quietly erodes the architecture.

Two Governance Planes

It is worth being precise about the boundary, because the two planes are easy to collapse into one word.

| | Agent governance | Engineering governance |
| --- | --- | --- |
| What is governed | What agents can access and do | The code changes and engineering decisions agents make |
| Primary concern | Authorization, safety, auditability | Architectural consistency, standards, intent |
| Source policies | Organizational controls, responsible-AI rules | ADRs, architecture rules, engineering standards |
| Enforcement point | Deployment and runtime | Before generation, and in CI |
| Failure example | An agent calls a prohibited tool | An approved agent violates a service boundary |

Stated plainly: agent governance controls what agents can do. Engineering governance controls what they build. Palantir’s framework is a thorough account of the first. The second is a different control plane, and it is the one that decides whether a generated change should exist.

Why Bounded Execution Isn’t Enough

Bounded execution, one of Palantir’s governance controls, restricts which systems and tools an agent may reach. It is necessary but not sufficient for software. A coding agent with flawless bounds, scoped to its own repository and a set of approved tools, can still produce an architecturally wrong change entirely inside those bounds. Permissions govern reach. They do not govern design. The question that matters for a codebase is not “can the agent access this repository?” but “does the change it pushed respect the architecture of the system it is changing?”
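The gap between reach and design can be shown with a minimal sketch. Everything here is hypothetical: the `Change` record, the `billing` repository, and the rule that `billing` may not import from `orders.internal` are assumptions made for illustration, not anything described in Palantir’s paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    repo: str                  # repository the agent wrote to
    path: str                  # file it modified
    imports: tuple[str, ...]   # modules the new code imports


# Hypothetical permission bounds: the agent may only touch this repository.
ALLOWED_REPOS = {"billing"}

# Hypothetical architecture rule: billing must not reach into orders internals.
FORBIDDEN_IMPORTS = {"billing": ("orders.internal",)}


def within_bounds(change: Change) -> bool:
    """Reach check: is the agent operating where it is permitted to?"""
    return change.repo in ALLOWED_REPOS


def respects_boundaries(change: Change) -> bool:
    """Design check: does the change avoid forbidden cross-service imports?"""
    forbidden = FORBIDDEN_IMPORTS.get(change.repo, ())
    return not any(
        imp == prefix or imp.startswith(prefix + ".")
        for imp in change.imports
        for prefix in forbidden
    )


change = Change(
    repo="billing",
    path="billing/invoice.py",
    imports=("orders.internal.models",),
)
print(within_bounds(change))        # True: the agent stayed inside its bounds
print(respects_boundaries(change))  # False: the change crosses a service boundary
```

The two functions read different inputs and answer different questions, which is why passing the first says nothing about the second.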

What Engineering Leaders Should Do

Read Palantir’s paper as validation, not competition. It is right that governance must move downstream to where agents operate. For software teams, “where agents operate” includes the codebase itself, which means extending the same logic one layer further.

  1. Turn architectural decisions into executable constraints. ADRs, approved dependencies, and service boundaries become machine-readable rules rather than documents an agent may never retrieve.
  2. Enforce before generation, not only at deployment. Palantir’s controls act at runtime and approval; governance before generation acts at the moment the agent chooses an implementation, where an architectural violation can be prevented.
  3. Verify with deterministic checks. Deterministic enforcement gives the same change the same verdict, with provenance back to the decision that authorized it.
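The three steps above can be sketched together. This is a sketch under stated assumptions, not a reference implementation: the `Rule` and `Verdict` types, the ADR identifiers, and the dependency names are all hypothetical, and a real rule set would be derived from a team’s own decision records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    adr: str                # hypothetical ADR identifier, kept for provenance
    service: str            # service the decision applies to
    banned_dependency: str  # dependency the decision prohibits


@dataclass(frozen=True)
class Verdict:
    allowed: bool
    violated: tuple[str, ...]  # ADR identifiers the change violates


# Step 1: decisions expressed as data rather than prose documents.
RULES = (
    Rule(adr="ADR-0001", service="billing", banned_dependency="legacy_orm"),
    Rule(adr="ADR-0002", service="billing", banned_dependency="orders.internal"),
)


def check(service: str, dependencies: frozenset[str], rules=RULES) -> Verdict:
    """Step 3: a pure function, so the same change always gets the same verdict."""
    violated = tuple(sorted(
        rule.adr
        for rule in rules
        if rule.service == service and rule.banned_dependency in dependencies
    ))
    return Verdict(allowed=not violated, violated=violated)


proposed = frozenset({"requests", "legacy_orm"})
print(check("billing", proposed))
```

Because `check` needs only the service name and a proposed dependency set, the same function could in principle be called before an agent commits to an implementation (step 2) and again in CI. Each verdict carries the ADR identifiers behind it.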

Agent governance and engineering governance are complementary, and a mature enterprise will need both: one to decide whether an agent may act, the other to decide what it is allowed to build. We have argued that agent governance is splitting into two markets; Palantir’s white paper is high-authority evidence for the split, and a precise map of the half it does not cover. Agent governance controls what agents can do. Engineering governance controls what they build.