Governance gates in Claude Agent SDK workflows.
An autonomous agent is tasked with "add caching to the user lookup service." Left unchecked, it introduces Redis — a clean, reasonable move that silently kills ADR-001. With Mneme's pre-execution governance hook wrapping the SDK tool call, the violation is caught before the file write executes and the agent is rerouted to the compliant in-process JSON cache primitive.
The scenario
A single autonomous agent runs inside a Claude Agent SDK loop. Its task is scoped and reasonable: add caching to the user lookup service so repeated reads don't hit the JSON store on every call. Nothing about the task description hints at an architectural constraint. The agent reaches for the most natural tool available: Redis.
The problem is not that Redis is wrong in general. The problem is that this codebase has a binding decision against external dependencies for persistence. That decision lives in the compiled corpus — not in the agent's context window, not in a comment, not in a README. Without a governance layer, the agent has no way to know. With one, the intercept happens before any file is written.
- ADR-001: JSON storage only. No external database. No Redis. Persistence stays in-process and in-file.
- ADR-004: Repository pattern. All persistence flows through a Repository abstraction. No leakage of storage primitives into service or handler layers.
Both invariants are relevant here. ADR-001 blocks the Redis dependency outright. ADR-004 governs what the compliant replacement must look like: the cache primitive must sit behind the Repository interface, not be called directly from the service layer.
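To make the two constraints concrete, here is a minimal sketch of what a compliant cache could look like. Only `JsonUserRepository` is named in this article; the `UserRepository` protocol, `CachedJsonUserRepository`, `lookup_display_name`, and the JSON file layout are hypothetical names and shapes chosen for illustration, not the project's actual code.

```python
import json
from pathlib import Path
from typing import Optional, Protocol


class UserRepository(Protocol):
    """Hypothetical Repository interface the service layer depends on (ADR-004)."""

    def get(self, user_id: str) -> Optional[dict]: ...


class JsonUserRepository:
    """Reads users from a JSON file: in-process, in-file persistence (ADR-001)."""

    def __init__(self, path: Path) -> None:
        self._path = path
        self.reads = 0  # counts file reads, for illustration only

    def get(self, user_id: str) -> Optional[dict]:
        self.reads += 1
        users = json.loads(self._path.read_text(encoding="utf-8"))
        return users.get(user_id)


class CachedJsonUserRepository(JsonUserRepository):
    """Adds an in-process dict cache without changing the Repository interface."""

    def __init__(self, path: Path) -> None:
        super().__init__(path)
        self._cache: dict[str, Optional[dict]] = {}

    def get(self, user_id: str) -> Optional[dict]:
        # Hit the JSON file only on the first lookup for a given id.
        if user_id not in self._cache:
            self._cache[user_id] = super().get(user_id)
        return self._cache[user_id]


def lookup_display_name(repo: UserRepository, user_id: str) -> str:
    """Service-layer function: sees only the Repository, never the cache."""
    user = repo.get(user_id)
    return user["name"] if user else "unknown"
```

The service function takes any `UserRepository`, so swapping the cached variant in requires no change to the service layer, which is the property ADR-004 asks for.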
Without governance
No hook. No corpus check. The agent reasons against its training data and the task description alone. The trajectory is fast and locally coherent:
- → Agent proposes redis-py as the caching backend, a standard pattern.
- → str_replace_editor writes cache.py importing redis.
- → user_service.py is updated to call the Redis client directly from the service layer.
- → requirements.txt gains redis==5.0.1.
- → CI passes. ADR-001 and ADR-004 are both silently dead.
The diff is clean. The tests pass. There is no signal in the output that two architectural decisions were just overridden without a recorded choice. The next agent to work in this codebase will see Redis as an established primitive and build on it further.
With governance: pre-execution hook
The hook wraps the SDK's tool dispatch layer. Before any str_replace_editor or write_file call executes, the proposed diff is evaluated against the compiled corpus. If the verdict is FAIL, a GovernanceViolation is raised — the tool call never proceeds, and the agent receives a structured remediation prompt.
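    import mneme


    @mneme.governance_hook(corpus="project_memory.json")
    async def before_tool_call(tool_name: str, tool_input: dict) -> dict:
        # Only file-writing tools are checked; other tools pass through untouched.
        if tool_name in ("str_replace_editor", "write_file"):
            verdict = mneme.check(
                # Assumes the proposed content arrives under "new_str".
                diff=tool_input.get("new_str", ""),
                corpus="project_memory.json",
            )
            if verdict.status == "FAIL":
                # Raising here stops the tool call before anything is written.
                raise mneme.GovernanceViolation(verdict)
        return tool_input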
The hook is registered once at SDK initialization. From that point on, every file-write tool call passes through governance before touching the filesystem. The agent does not need to be modified — only the tool dispatch layer.
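What "wrapping the tool dispatch layer" could mean in practice is sketched below. This is not the Claude Agent SDK's API or Mneme's: `GovernedDispatcher`, `deny_redis`, and the local `GovernanceViolation` and `write_file` are hypothetical stand-ins so the example runs on its own. It only illustrates the ordering guarantee: the hook runs first, and a raised violation means the tool body never executes.

```python
import asyncio
from typing import Any, Awaitable, Callable

ToolInput = dict[str, Any]
Hook = Callable[[str, ToolInput], Awaitable[ToolInput]]
Tool = Callable[[ToolInput], Awaitable[Any]]


class GovernanceViolation(Exception):
    """Stand-in for mneme.GovernanceViolation so the sketch runs on its own."""


class GovernedDispatcher:
    """Hypothetical dispatch layer: every tool call passes through the hook first."""

    def __init__(self, tools: dict[str, Tool], hook: Hook) -> None:
        self._tools = tools
        self._hook = hook

    async def dispatch(self, tool_name: str, tool_input: ToolInput) -> Any:
        # The hook may raise; if it does, the tool below is never invoked.
        checked_input = await self._hook(tool_name, tool_input)
        return await self._tools[tool_name](checked_input)


written: list[str] = []  # records the writes that actually executed


async def write_file(tool_input: ToolInput) -> str:
    written.append(tool_input["path"])
    return "ok"


async def deny_redis(tool_name: str, tool_input: ToolInput) -> ToolInput:
    # Toy rule standing in for the corpus check.
    if "import redis" in tool_input.get("new_str", ""):
        raise GovernanceViolation("ADR-001: no external database")
    return tool_input


async def demo() -> list[str]:
    dispatcher = GovernedDispatcher({"write_file": write_file}, deny_redis)
    try:
        await dispatcher.dispatch(
            "write_file", {"path": "cache.py", "new_str": "import redis\n"}
        )
    except GovernanceViolation:
        pass  # blocked before the write ran
    await dispatcher.dispatch(
        "write_file", {"path": "cache.py", "new_str": "import json\n"}
    )
    return written


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because the check lives in the dispatcher, the agent code that requests the write is unchanged, which matches the claim that only the dispatch layer needs modification.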
Agent proposes Redis · blocked before write
The agent's first draft writes cache.py with a Redis client. The pre-execution hook fires on the str_replace_editor call, scores the proposed diff against the corpus, and surfaces ADR-001 as a binding conflict. The GovernanceViolation is raised before any file is touched. The agent receives a remediation prompt containing the violated ADR and the compliant alternative.
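The shape of a verdict that carries both the violated ADR and a compliant alternative can be sketched as follows. The real corpus check scores the diff against the compiled corpus; this example swaps in a plain regular-expression match, and the `Verdict` fields and `CORPUS` entry format are assumptions made for illustration only.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Verdict:
    status: str  # "PASS" or "FAIL" in this sketch
    violated: list[str] = field(default_factory=list)
    remediation: list[str] = field(default_factory=list)


# Hypothetical corpus entries; the real compiled corpus format may differ.
CORPUS = [
    {
        "id": "ADR-001",
        "forbidden": r"^\s*(import|from)\s+redis\b",
        "remediation": "Use the in-process JSON cache primitive.",
    },
]


def check(diff: str, corpus: list[dict] = CORPUS) -> Verdict:
    """Return FAIL if any line of the proposed diff matches a forbidden pattern."""
    verdict = Verdict(status="PASS")
    for rule in corpus:
        if re.search(rule["forbidden"], diff, flags=re.MULTILINE):
            verdict.status = "FAIL"
            verdict.violated.append(rule["id"])
            verdict.remediation.append(rule["remediation"])
    return verdict
```

A pattern match like this would miss indirect uses of a forbidden dependency, which is one reason a later session-level check is still useful.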
On the retry, the agent produces a compliant cache.py that extends the existing JsonUserRepository. The hook re-evaluates the new diff against the corpus and finds no conflicts.
The corpus is unchanged. The verdict and retry count are appended to the structured trace for the post-execution verification pass.
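The intercept, remediation, and retry sequence described above can be sketched as a small loop. Everything here is illustrative: `run_with_retries`, `scripted_agent`, the local `hook`, the `repository` module name, and the trace entry fields are hypothetical, and the local `GovernanceViolation` is a stand-in for Mneme's exception.

```python
from typing import Callable, Optional


class GovernanceViolation(Exception):
    """Stand-in for mneme.GovernanceViolation, carrying the cited ADR ids."""

    def __init__(self, adr_ids: list[str], message: str) -> None:
        super().__init__(message)
        self.adr_ids = adr_ids
        self.message = message


def hook(diff: str) -> None:
    # Toy check standing in for the corpus evaluation.
    if "import redis" in diff:
        raise GovernanceViolation(["ADR-001"], "JSON storage only. No Redis.")


def run_with_retries(
    propose: Callable[[Optional[str]], str], max_retries: int = 2
) -> tuple[str, list[dict]]:
    """Call propose(feedback) until the hook accepts the diff or retries run out."""
    trace: list[dict] = []
    feedback: Optional[str] = None
    for attempt in range(max_retries + 1):
        diff = propose(feedback)
        try:
            hook(diff)
        except GovernanceViolation as violation:
            trace.append(
                {"attempt": attempt, "verdict": "FAIL", "adrs": violation.adr_ids}
            )
            # Structured remediation: the specific ADR, not a generic error.
            feedback = f"Violates {', '.join(violation.adr_ids)}: {violation.message}"
            continue
        trace.append({"attempt": attempt, "verdict": "PASS", "adrs": []})
        return diff, trace
    raise RuntimeError("no compliant diff within the retry budget")


def scripted_agent(feedback: Optional[str]) -> str:
    # First draft reaches for Redis; the retry follows the remediation.
    if feedback is None:
        return "import redis\n"
    return "from repository import JsonUserRepository\n"
```

Calling `run_with_retries(scripted_agent)` returns the accepted diff together with a two-entry trace, one FAIL and one PASS, which is the kind of record a later verification pass can consume.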
With governance: post-execution verification
After the agent completes its full loop — cache layer written, service updated, requirements unchanged — Mneme runs a post-execution check across all files touched in the session. This catches cases where a compliant individual diff still introduces a problem at the integration level: a service layer method that calls the cache directly rather than through the Repository, for example.
All invariants hold · PASS
The post-execution check scans all files written or modified during the agent session. It confirms that no external dependency was introduced, that the cache layer sits behind the Repository abstraction, and that the service layer contains no direct references to the storage primitive. The verdict is clean.
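A session-level scan of this kind might look like the sketch below. It is a simplified illustration, not Mneme's post-execution check: the rule sets, the `_service.py` naming convention for the service layer, and the result format are all assumptions. It uses Python's `ast` module to read imports rather than matching text.

```python
import ast
import re

EXTERNAL_STORES = {"redis"}              # ADR-001, hypothetical rule encoding
STORAGE_PRIMITIVES = {"redis", "json"}   # ADR-004: off-limits in the service layer


def imported_modules(source: str) -> set[str]:
    """Collect top-level module names imported anywhere in a Python source string."""
    modules: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module.split(".")[0])
    return modules


def post_execution_check(touched: dict[str, str]) -> dict:
    """touched maps each file path written in the session to its final contents."""
    findings: list[tuple[str, str]] = []
    for path, source in touched.items():
        if path.endswith("requirements.txt"):
            for line in source.splitlines():
                # Take the package name before any version specifier.
                name = re.split(r"[=<>!~\[; ]", line.strip(), maxsplit=1)[0].lower()
                if name in EXTERNAL_STORES:
                    findings.append(("ADR-001", path))
        elif path.endswith(".py"):
            modules = imported_modules(source)
            if modules & EXTERNAL_STORES:
                findings.append(("ADR-001", path))
            # Assumes service-layer files are named *_service.py.
            if path.endswith("_service.py") and modules & STORAGE_PRIMITIVES:
                findings.append(("ADR-004", path))
    return {"status": "FAIL" if findings else "PASS", "findings": findings}
```

Checking the whole set of touched files at once is what lets this pass flag a service file that bypasses the Repository even when each individual write looked acceptable.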
The two-phase model — intercept before write, verify after loop — closes the gap between per-call correctness and session-level correctness. Either gate can catch what the other misses.
Enforcement traces in long-running workflows
A single-turn agent session is the simplest case. The SDK pattern becomes more important — and the trace more valuable — as workflows grow in duration and scope: multi-step plans, scheduled agents, retry loops triggered by CI failures, and human-in-the-loop reviews that span hours or days.
- CI gate: Enforcement at the merge boundary. The verdict trace from the agent session becomes a structured artifact in CI. If the post-execution check did not produce a clean PASS, the merge is blocked. No human needs to read the diff to catch the violation.
- Retry loop: Structured remediation, not blind retry. When the agent retries after a GovernanceViolation, it receives the specific ADR text and the conflict description, not a generic failure message. Retry quality improves because the agent knows exactly what constraint it violated.
- Audit: Decision lineage across sessions. The trace records every intercept, the verdict score, the ADRs cited, and the retry count. When a future agent or human reviews why the codebase looks the way it does, the governance layer's decision log is the answer, not a reconstructed commit history.
The key property in each case is that the governance artifact persists independently of the agent. The agent's context window ends. The SDK session closes. The trace does not. The next agent, CI run, or human reviewer picks up from a known, auditable state.
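As one illustration of a persisted trace being consumed outside the agent, a CI step could read the trace file and turn the post-execution verdict into an exit code. The trace schema used here (a `post_execution` object with `status` and `adrs`) and the `gate` function are assumptions for the sketch; the actual trace format may differ.

```python
import json
import sys


def gate(trace: dict) -> int:
    """Return a process exit code: 0 only when the post-execution verdict is PASS."""
    post = trace.get("post_execution", {})
    if post.get("status") == "PASS":
        return 0
    # Anything other than a clean PASS (WARN, FAIL, or a missing verdict) blocks.
    for adr in post.get("adrs", []):
        print(f"blocked by {adr}", file=sys.stderr)
    return 1


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python gate.py trace.json
    with open(sys.argv[1], encoding="utf-8") as handle:
        sys.exit(gate(json.load(handle)))
```

Treating a missing verdict as a failure is a deliberate choice in this sketch: a session that never produced a post-execution result should not be able to merge by omission.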
The runnable example
The repo ships a Python script that runs the full two-phase scenario against the real Mneme pipeline. The agent is simulated as a scripted diff producer — no live LLM call is required. The hook, corpus check, violation raise, remediation inject, and post-execution scan are the actual Mneme code paths.
    git clone https://github.com/TheoV823/mneme
    cd mneme/examples/agent-sdk-governance
    python run.py
The script prints the without-governance trajectory first (Redis lands, no signal), then the with-governance trajectory (intercept fires, retry succeeds, post-exec PASS). The verdict trace is the same structured PASS / WARN / FAIL format as mneme check in CI.
Honest framing. This is a forward-looking demo. The hook mechanism and corpus check are real Mneme code. The Claude Agent SDK integration — registering the hook at SDK initialization and wiring it to the tool dispatch layer — reflects the direction the SDK's extension model is heading. The runnable example proves the governance property using the existing pipeline; the full SDK integration is on the roadmap.