When a team ships their first ADR process, the instinct is to put the ADRs in a wiki or a docs/adr/ folder and call it governance. In the human-only workflow, this works well enough — engineers read the ADRs, internalize the decisions, and apply them in code review.
In the AI-assisted workflow, this breaks down immediately. AI coding agents don't read ADRs. They generate code from prompts. If you want an agent to respect a decision, the decision needs to be in the prompt — injected as structured context before generation, not sitting in a wiki hoping the agent will find it.
But even injecting decisions into the prompt isn't governance. It's suggestion. That three-way distinction between documentation, suggestion, and enforcement is the architecture of the governance problem.
The three-tier model
There are three tiers of architectural knowledge in an AI coding workflow, each with a different relationship to enforcement:
1. Documentation — prose that explains context, rationale, and history. Wikis, ADR bodies, design docs. Read by humans. Retrieved (imprecisely) by RAG. No enforcement capability.
2. Prompt memory — constraints injected into the model context. CLAUDE.md files, rules files, retrieved RAG passages. The model may or may not follow them. No structural enforcement.
3. Decision memory — structured, schema-validated constraint records evaluated deterministically against model output. Authoritative, precedence-aware, and scoped to explicit file patterns. Enforcement-capable.
Most teams operating at Tier 1 believe they're at Tier 3. The most common mistake in AI coding governance is mistaking documentation retrieval for enforcement.
What documentation is and what it isn't
Documentation is the written record of why decisions were made. ADR bodies contain context (why did we face this decision?), rationale (why did we choose this option?), and consequences (what are we accepting by making this choice?). This is valuable. It's the institutional knowledge that prevents future engineers from repeating past mistakes.
What documentation is not: a machine-evaluable constraint. A prose paragraph explaining why PostgreSQL was chosen for the payments service does not tell a governance system what exactly the AI agent is forbidden from writing, which files the decision applies to, or what takes precedence when it conflicts with another decision.
Those questions require structured fields, not prose. And structured fields are precisely what distinguish decision memory from documentation.
The schema difference
The structural gap between documentation and decision memory is visible at the schema level. A typical ADR file has:

- Context (why we faced the decision)
- Rationale (why we chose this option)
- Consequences (what we accept)
- Status (human-readable)
- Date (when written)

A decision record for the same content has:

- Scope pattern (which files)
- Machine-readable status
- Supersedes (conflict lineage)
- Priority tier (org/project/feature)
- Constraint text (what to enforce)
- Tags (retrieval signals)

The documentation version says what was decided. The decision record says what the AI agent is forbidden from doing, in which files, with what precedence, superseding which earlier record. These are the fields that governance evaluation requires.
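Concretely, one such record in project_memory.json might look like the following. The field names follow the list above but are illustrative assumptions, not Mneme's exact schema:

```json
{
  "id": "DEC-0042",
  "status": "active",
  "tier": "project",
  "scope": "services/payments/**/*.py",
  "constraint": "Use PostgreSQL via SQLAlchemy ORM; no direct psycopg2 connections.",
  "supersedes": "DEC-0017",
  "tags": ["database", "payments", "orm"]
}
```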
Why precedence resolution requires decision memory
The hardest governance problem isn't enforcing a single decision — it's resolving conflicts between multiple applicable decisions. Conflicts happen constantly in real codebases:
- An org-level rule says "no direct database calls from HTTP handlers."
- A project-level decision says "this service's admin handler is an approved exception — approved by platform team on 2024-11-10."
- A feature-level decision says "the new bulk-import handler needs direct connection for performance — feature flag: bulk-import-v2."
When an AI agent edits services/payments/bulk_import.py, all three decisions are potentially applicable. Which wins?
The answer requires a precedence engine that evaluates: status (is each decision active?), scope specificity (the feature-level decision is most specific), supersedes relationships (does any decision explicitly supersede another?), and priority tier (feature > project > org for exceptions). A governance system that lacks structured fields for any of these dimensions cannot compute the answer deterministically.
Documentation retrieval leaves conflict resolution to the model. When RAG surfaces three conflicting passages, the model picks whichever it finds most persuasive in context — which is probabilistic and ungovernable. Decision memory carries the fields that make conflict resolution computable.
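To make "computable" concrete, here is a minimal Python sketch of precedence resolution over the illustrative record fields above. It is a toy, not Mneme's engine: status filter, scope match, supersedes pruning, then tier and specificity as tiebreakers:

```python
import fnmatch
from dataclasses import dataclass
from typing import Optional

TIER_RANK = {"feature": 3, "project": 2, "org": 1}  # feature-level exceptions win

@dataclass
class Decision:
    id: str
    status: str                       # "active" or "superseded"
    tier: str                         # "org", "project", or "feature"
    scope: str                        # glob pattern over file paths
    constraint: str
    supersedes: Optional[str] = None  # id of the record this one replaces

def resolve(decisions: list[Decision], path: str) -> Optional[Decision]:
    # 1. Status and scope: keep active records whose glob matches the edited file.
    applicable = [d for d in decisions
                  if d.status == "active" and fnmatch.fnmatch(path, d.scope)]
    # 2. Supersession: drop any record explicitly replaced by an applicable one.
    replaced = {d.supersedes for d in applicable if d.supersedes}
    applicable = [d for d in applicable if d.id not in replaced]
    # 3. Precedence: higher tier wins; longer glob as a crude specificity tiebreak.
    applicable.sort(key=lambda d: (TIER_RANK[d.tier], len(d.scope)), reverse=True)
    return applicable[0] if applicable else None
```

Given records for the three conflicting decisions above, resolve(records, "services/payments/bulk_import.py") returns the feature-level exception deterministically, every time, regardless of what the model finds persuasive.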
The enforcement gap
Even if you resolve the retrieval and conflict problems, documentation-based approaches face a fundamental enforcement gap: they cannot validate generated output.
Enforcement requires two things the documentation path cannot provide:
- A structured constraint to evaluate against. "Use PostgreSQL with SQLAlchemy ORM" is a constraint. Three paragraphs about why PostgreSQL was chosen are not. Evaluation requires a precise, machine-testable assertion — what import is forbidden, what pattern must appear, what must not appear.
- A layer that inspects output after generation. The governance system must see what the AI actually wrote, compare it against the applicable constraints, and emit a verdict before the code reaches review. This is the Evaluator layer in Mneme — separate from retrieval, separate from the LLM, architecturally downstream of both.
Documentation retrieval injects context before generation. Decision memory enforcement checks output after generation. Both are necessary for a governance system that catches violations reliably.
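Here is what the second requirement can look like in miniature. This is a hedged sketch, not Mneme's Evaluator: one compiled constraint (a forbidden import, assumed for illustration) checked against generated output, emitting a verdict tied to a decision ID:

```python
import re

# Hypothetical compiled form of "no direct psycopg2 connections":
# a machine-testable assertion over source text, not a prose rationale.
FORBIDDEN_IMPORT = re.compile(r"^\s*(import psycopg2|from psycopg2\b)", re.MULTILINE)

def evaluate(decision_id: str, generated_code: str) -> dict:
    """Inspect model output after generation; emit a per-check verdict."""
    hit = FORBIDDEN_IMPORT.search(generated_code)
    return {
        "decision_id": decision_id,  # per-check audit trail
        "verdict": "FAIL" if hit else "PASS",
        "evidence": hit.group(0).strip() if hit else None,
    }
```

The point is architectural, not the regex: evaluation runs downstream of generation, against a structured assertion, and produces a traceable verdict rather than a suggestion.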
Converting documentation to decision memory
The path from documentation to decision memory doesn't require rewriting your ADRs. Mneme's ADR import integration reads existing ADR files and extracts constraints from a ## Constraints section that engineers add alongside (not in place of) the existing ADR structure.
Each constraint line in the Constraints section is compiled into a structured decision record in project_memory.json. The ADR body — context, rationale, consequences — remains documentation and is not imported into the enforcement corpus. The two layers coexist: documentation for human understanding, decision memory for machine enforcement.
The ADR import workflow:

1. Add a Constraints section to any existing ADR.
2. Run mneme import-adr --preview to see what records would be created.
3. Run the same command without --preview to apply.

Your ADRs gain enforcement capability without losing their documentation function.
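As an illustration, the added section might look like the following. The ADR body above it stays untouched; the exact constraint-line syntax Mneme parses is an assumption here:

```markdown
## Constraints

- Use PostgreSQL with SQLAlchemy ORM for all persistence in services/payments/.
- Do not import psycopg2 directly in HTTP handler modules.
```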
The full comparison
| Dimension | Documentation | Decision memory |
|---|---|---|
| Format | Free-form prose | Typed schema (id, scope, status, constraint…) |
| Primary consumer | Human engineers | Governance system (retriever + evaluator) |
| Retrieval method | Keyword search or semantic similarity | Deterministic field-weighted scorer |
| Scope handling | None — applies conceptually | Explicit glob patterns per record |
| Conflict resolution | Left to reader / model | Precedence engine (status, tier, supersedes) |
| Enforcement point | Suggestion via context injection | Post-generation evaluation with PASS/FAIL verdict |
| Audit trail | None — no per-decision trace | Decision ID, field match, verdict per check |
| Version control | File-level (who changed the doc) | Record-level (which record, which field, which commit) |
| Supersession | Noted in prose only | Machine-readable supersedes field — old record deactivated |
When documentation is the right tool
This isn't an argument against documentation. Documentation serves an irreplaceable function: it preserves the reasoning behind decisions in a form engineers can read, discuss, and learn from. A team with rich ADRs and architecture docs has a significant advantage over one without.
The argument is narrower: documentation is not a governance system, and treating it as one creates a false sense of safety. A team that believes its ADR wiki is enforcing architectural decisions in the AI coding workflow is exposed to exactly the violations it thinks it has prevented.
The right stack is both: documentation for human understanding and institutional memory, decision memory for machine-readable constraint enforcement. The two are complementary, not competing. Mneme is designed to sit alongside your documentation — it enforces the enforceable subset of your architectural decisions without asking you to stop writing ADRs.