Most of the conversation about "AI coding governance" in 2026 is still about the wrong layer. Prompt rules. CLAUDE.md. .cursor/rules. RAG over an ADR folder. A reviewer agent on PRs. Policy docs in a wiki. Every one of these answers the question "how do we tell the model what we want?" — and not one of them answers the prior question: "when two of the things we want disagree, which one wins?"
That second question is not a corner case. It is the central question of any governance system that has to operate at the scale of a real codebase, with real exceptions, written by real teams over real years. And it is the question almost every tool in the category is currently dodging.
The missing layer has a name. It is precedence semantics. It is what makes governance deterministic, reviewable, and durable. Without it, "AI coding governance" is just a polite name for whichever rule the model happened to attend to this morning.
The collision nobody owns
Consider a perfectly ordinary situation. A team's data layer is governed by ADR-014: "All persistent data access goes through the repository pattern. No service may call an ORM session directly." The decision is sound. It has been the law of the codebase for a year. Every reviewer can recite it.
Eight months later, the payments team writes ADR-022: "The payments service may invoke the Stripe SDK directly inside an idempotency-key boundary. The repository abstraction does not compose with Stripe's at-most-once semantics, and the ledger requires raw call ordering." Also sound. Also reviewed. Also accepted.
These two decisions overlap. Specifically, they overlap on the question "how does the payments service write a charge to durable storage?" Both apply. They disagree.
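The overlap is mechanical, not subtle. A minimal sketch makes it visible; the glob scopes here are hypothetical, since neither ADR in the example declares one, and `fnmatch`'s `*` spans path separators:

```python
from fnmatch import fnmatch

# Hypothetical scope globs for the two decisions.
scopes = {
    "ADR-014": "services/*",           # repository pattern, codebase-wide
    "ADR-022": "services/payments/*",  # direct Stripe SDK, payments only
}

path = "services/payments/charge_handler.py"

# Both scopes match the same file: both decisions apply, and they disagree.
applicable = [adr for adr, scope in scopes.items() if fnmatch(path, scope)]
print(applicable)  # ['ADR-014', 'ADR-022']
```

Nothing in this picture says which decision wins; it only establishes that the question has to be answered.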
Now four things happen at once:
- An engineer in Cursor opens the payments service and writes a charge handler. Cursor reads .cursor/rules, sees ADR-014 because it was indexed first, and proposes a repository-based implementation.
- An engineer in Claude Code, with a different CLAUDE.md, sees ADR-022 highlighted and writes a direct-SDK implementation. Both diffs are reviewed in isolation; both get approved.
- An async PR bot built on the Agent SDK refactors an adjacent file and rewrites the call site to obey ADR-014, because that is what its system prompt encodes.
- Six months later, a new engineer reads the codebase, sees both patterns in production, and asks the obvious question: "which one are we supposed to follow?" Nobody can answer without re-reading both ADRs and re-litigating the conflict from scratch.
None of those four actors is misbehaving. The codebase has two correct decisions and no mechanism to say which one wins where. That is the precedence problem, and it is exactly the problem that no prompt rule, no retrieval system, and no review process will ever resolve, because none of them are designed to resolve anything — they are designed to retrieve.
Why current systems can't resolve it
It is tempting to think that the conflict in the example above is a content problem — that if the team had written ADR-022 more carefully, or pasted it higher in CLAUDE.md, the right answer would emerge. It is not a content problem. It is a structural one. Every governance substrate currently in widespread use is fundamentally a retrieval substrate, and retrieval has no opinion on conflict.
Rules files such as .cursor/rules and .github/copilot-instructions.md hand the model a block of text. When two rules disagree, the model picks one based on whichever it attended to more strongly under this temperature, this context length, this surrounding code. Reorder the file and the answer changes. That is not governance; it is a coin flip with extra steps.

The common failure underneath all of these substrates is the same: none of them has an opinion on which rule applies when several do. They surface rules. They do not resolve between rules. The category has confused "the model has read the constraint" with "the constraint has been applied," and those are not the same thing.
What precedence semantics actually is
Precedence semantics is the small body of rules a governance system uses to answer one question, every time, the same way: "given the current task, the current file, the current scope, and the full set of architectural decisions, which decision actually applies here, and which loses?"
The answer has to be deterministic. Not "the model probably picks the right one." Not "the reviewer usually catches it." Deterministic, in the engineering sense: same inputs in, same answer out, every time, regardless of which agent or model or temperature is on the other end.
It also has to be reviewable. The resolution has to be explainable as a chain of named facts — this scope was more specific, that decision supersedes the earlier one, this status retired the override — not as an opaque embedding score. A reviewer has to be able to read the resolution and either agree with it or, when it is wrong, point at the rule that produced the wrong answer and propose a fix.
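As a concrete sketch of what such a reviewable record might contain, using the ADR-014/ADR-022 running example — the record shape and wording are illustrative, not a spec:

```python
# Illustrative resolution record: a chain of named facts, each citing
# the axis that decided it. Every line is individually disputable.
resolution = {
    "path": "services/payments/charge_handler.py",
    "winner": "ADR-022",
    "facts": [
        "status: ADR-014 and ADR-022 are both accepted, so both in force",
        "supersedes: neither decision retires the other",
        "scope: services/payments/** is narrower than services/**",
    ],
}

# A reviewer who disagrees points at the fact, and hence at the axis
# and the declared property, that produced the wrong answer.
for fact in resolution["facts"]:
    print("-", fact)
```

No embedding score appears anywhere in the record; every entry names a declared property of a decision.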
Governance is not memory retrieval. Governance is deterministic conflict resolution over architectural constraints. Retrieval is one input to that resolution — not a substitute for it.
The five axes of resolution
A useful precedence engine resolves over a small, finite set of axes. Five of them carry almost all the weight in real codebases. None of them is novel on its own — what is missing in the AI-coding-governance category is the recognition that all five have to compose, and that the order in which they compose has to be explicit.
| Axis | What it answers | Example |
|---|---|---|
| Status | Is the decision in force at all? | A deprecated ADR loses to any accepted ADR that touches the same scope, even if the deprecated one was more specific. |
| Supersedes | Does this decision explicitly retire an older one? | ADR-031 carries supersedes: ADR-014. Wherever they overlap, ADR-014 is treated as if it were deprecated. |
| Scope specificity | Whose scope is narrower? | ADR-022 (services/payments/**) beats ADR-014 (services/**) inside its narrower scope, by the same logic that makes a per-route override beat a global config. |
| Priority | When scopes are equal, who is authoritative? | A security-class ADR (priority: critical) wins over an ergonomics-class ADR (priority: normal) at the same scope. Priorities are declared, not inferred. |
| Temporal | If everything else ties, the newer decision wins. | Two equal-priority, equal-scope, neither-supersedes-the-other decisions resolve by acceptance date. This axis is the tiebreaker, not the primary driver. |
Two things matter about that table more than the contents of any one row. First: the axes are evaluated in a declared order, not improvised per query. Second: each axis is a fact carried by the decision itself, not a heuristic about the decision. status, supersedes, scope, priority, accepted_at — these are properties an ADR declares, not inferences a model has to make. Once declared, resolution is a finite computation, not a guess.
This is why precedence semantics is the missing layer. Every term in the table is already familiar to anyone who has written ADRs for more than a year. What is new is the claim that resolving over them is a system responsibility, not a reviewer's job to do in their head.
Governance is deterministic conflict resolution
Once precedence semantics is named, the whole category reframes.
The retrieval framing makes the model the conflict-resolution engine. That is exactly the wrong place to put it. Models are excellent at code synthesis under a constraint and unreliable at choosing between constraints. The precedence framing keeps the model out of the resolution and lets it do what it is good at — generating code that conforms to a constraint that has already been chosen.
Said differently: the architectural truth of a codebase is not allowed to depend on which agent ran the query. If it does, the codebase does not have an architecture — it has a sample.
Governance is a compiler problem
The frame that follows from all of this is that an AI-era governance system is shaped like a compiler.
The input is the same kind of thing teams already write: ADRs, design documents, exceptions, and the relationships between them. The output is a single, queryable, scope-aware representation that every agent — interactive, async, in CI — can ask the same question of and get the same answer. Between the two is a pipeline: normalize, resolve, compile, enforce, trace.
Each stage in that pipeline already has a recognizable analog in software engineering. Normalization is what a parser does. Resolution is what a type checker or a constraint solver does. Compilation is what an intermediate representation is for. Enforcement is what a pre-commit hook or a CI gate is for. Traceability is what build provenance is for. What is new in 2026 is that all five together describe a layer above the agent — not a feature inside any single one.
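Under those analogies, the whole pipeline fits on a page. The following is a deliberately small sketch, not a real implementation: the record fields, including the requires tag naming the pattern a decision mandates and the glob scopes, are invented for illustration, and resolution follows the five-axis order described earlier.

```python
from datetime import date
from fnmatch import fnmatch

def normalize(raw):
    """Parser stage: raw ADR front matter into uniform records,
    with defaults for undeclared optional properties."""
    return [{"priority": 0, "supersedes": [], **r} for r in raw]

def resolve(decisions, path):
    """Constraint-solver stage: one deterministic winner per path,
    in status / supersedes / scope / priority / temporal order."""
    live = [d for d in decisions
            if d["status"] == "accepted" and fnmatch(path, d["scope"])]
    retired = {s for d in live for s in d["supersedes"]}
    live = [d for d in live if d["id"] not in retired]
    live.sort(key=lambda d: (d["scope"].count("/"), d["priority"],
                             d["accepted_at"]), reverse=True)
    return live[0] if live else None

def compile_rules(decisions, paths):
    """IR stage: one queryable table every agent reads identically."""
    return {p: resolve(decisions, p) for p in paths}

def enforce(table, path, pattern_used):
    """CI-gate stage, plus the provenance trace a reviewer can audit."""
    winner = table[path]
    ok = winner is None or winner["requires"] == pattern_used
    trace = f"{path} -> {winner['id'] if winner else 'no decision'}"
    return ok, trace

raw = [
    {"id": "ADR-014", "status": "accepted", "scope": "services/*",
     "accepted_at": date(2025, 2, 3), "requires": "repository"},
    {"id": "ADR-022", "status": "accepted", "scope": "services/payments/*",
     "accepted_at": date(2025, 10, 9), "requires": "direct-sdk"},
]
table = compile_rules(normalize(raw), ["services/payments/charge.py"])
ok, trace = enforce(table, "services/payments/charge.py", "direct-sdk")
print(ok, "|", trace)  # True | services/payments/charge.py -> ADR-022
```

Every agent in the portfolio queries the compiled table; none of them re-derives the resolution, which is what keeps the answer identical across tools.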
This is what makes the governance-as-compiler framing useful, and not just a metaphor. Compilers are deterministic by construction. Their outputs are reproducible. Their decisions are explainable. Their failures are localizable. Every property AI coding governance currently lacks is a property a compiler-shaped governance layer would have by default.
What this changes for engineering leaders
For an engineering leader looking at the category in 2026, the practical implication is that the question to ask of any "AI governance" pitch is no longer "does it read our rules?" It is: "what does it do when two of our rules disagree?"
If the answer is "the model figures it out," the system is a retrieval layer with a governance label on it. If the answer involves embeddings, ranking scores, or "we tune the prompt," the system is doing statistics, not governance. If the answer involves a declared resolution order over status, supersedes, scope, priority, and time, the system is doing the actual job.
That distinction matters more than the surface-level tool category. A team can run Claude Code, Cursor, Copilot, Windsurf, and three in-house SDK agents on the same codebase — as most engineering orgs already do — and still have a coherent architecture, but only if the layer underneath has a deterministic answer to "which decision applies here." Otherwise the codebase ends up as a sample of whichever agent ran last.
The category is moving up the stack. "AI coding tools" is becoming a portfolio decision. "How those tools resolve architectural conflict" is becoming the infrastructure decision underneath it. Precedence semantics is the name for that infrastructure layer.
The engineering teams that win the next cycle will not be the ones that picked the best assistant. They will be the ones whose architecture survived having a portfolio of assistants — because the layer that decided what the rules were did not live inside any of them.