Same prompt. Same model.
Different answer.
Because it has your project's architectural decisions.
What you're seeing
The interactive panel above runs the same prompt — "Refactor the storage backend for scalability" — through the same model twice. The two outputs disagree because one of them is governed by the project's architectural decisions and the other is not.
Without Mneme, the model recommends what its training data suggests for a generic backend refactor: migrate JSON storage to PostgreSQL or Redis, introduce an ORM, add a migration layer. It is reasonable advice in the abstract. It also ignores three architectural decisions this codebase has already made.
With Mneme, the relevant decision records are injected into the model's context before generation:
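As a sketch of what that injected context might contain — the field names below are illustrative assumptions, not the published project_memory.json schema:

```json
{
  "id": "ADR-001",
  "title": "Keep JSON file storage",
  "status": "accepted",
  "decision": "Persist state through the existing JSON storage module; do not migrate to PostgreSQL or Redis or introduce an ORM.",
  "scope": ["storage/"]
}
```

Injected as structured records rather than free text, decisions like this carry an ID the checker can cite back in its verdicts.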
Context injection makes the rules visible to the model. That is half the problem. The other half is enforcement — ensuring generated code that does reach disk is checked against the same decision corpus. The third panel runs mneme check --mode strict, which reads the generated diff and produces a structured verdict:
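A verdict from that strict run might look something like the following. The report shape is an illustrative assumption, anchored on the findings the demo describes — an unapproved prisma dependency and an ADR-004 violation in user.service.ts; the ADR-002 ID for the dependency policy is a placeholder:

```json
{
  "mode": "strict",
  "result": "FAIL",
  "findings": [
    {
      "level": "WARN",
      "decision": "ADR-002",
      "message": "prisma not in approved dependency list",
      "override": "tracked override available"
    },
    {
      "level": "FAIL",
      "decision": "ADR-004",
      "file": "user.service.ts",
      "message": "repository pattern violation"
    }
  ]
}
```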
This is the difference between context injection and architectural governance. Injection makes the decisions readable. Enforcement makes generated output answer to them. Mneme HQ does both, in one tool-agnostic layer that works across Claude Code, Cursor, GitHub Copilot, Windsurf, and custom SDK agents.
Try it yourself
Install and run the same flow against your own repository:
pip install mneme && mneme init && mneme check
The full source, hook integrations for Claude Code and Cursor, the GitHub Actions enforcement gate, and the project_memory.json schema are open source under the MIT license on GitHub.
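A minimal CI gate in the spirit of that GitHub Actions integration might look like this — a hedged sketch, not the shipped workflow:

```yaml
name: mneme-gate
on: [pull_request]
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install mneme
      # strict mode exits nonzero on a FAIL verdict, blocking the PR
      - run: mneme check --mode strict
```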
For the methodology behind the 18-scenario drift test, see the benchmark page. For how Mneme aligns with NIST CAISI, the Model Context Protocol, and AGENTS.md, see the standards landscape. For the long-form argument that governance has to live outside any one AI coding tool, see architectural governance across heterogeneous AI coding agents.
More scenarios
Each scenario walks through one verdict the demo above produces — what the AI agent is asked to do, what it would generate without Mneme, what changes when the relevant decision records are injected, and what mneme check emits afterward.
Storage decision enforcement
ADR-001 keeps the codebase on JSON storage. The model proposes extending the existing module instead of migrating to Postgres or Redis.
Dependency policy enforcement
An unapproved dependency (prisma) is flagged with a structured WARN, the originating decision ID, and a tracked override path.
Repository pattern enforcement
An ADR-004 violation in user.service.ts hard-fails mneme check in CI, blocking the PR until resolved or explicitly overridden.
Frequently asked questions
What does this demo show?
Two things: context injection, which places the project's architectural decisions in the model's context before generation, and mneme check, which validates generated code against those decisions and produces a structured PASS / WARN / FAIL report.
What is mneme check?
The command that reads a generated diff and checks it against the project's decision records, emitting a structured PASS / WARN / FAIL verdict. In --mode strict, a violation fails the check outright.
Does this work with Claude Code, Cursor, and GitHub Copilot?
Yes. Mneme runs as a tool-agnostic layer: hook integrations ship for Claude Code and Cursor, and the same enforcement works with GitHub Copilot, Windsurf, and custom SDK agents.
How is this different from CLAUDE.md or .cursor/rules?
CLAUDE.md and .cursor/rules are static text files the model is asked to respect. Mneme is a structured decision store with a precedence engine and hook-level enforcement, so compliance is enforced rather than probabilistic. The full breakdown is in why prompt memory fails at scale; the head-to-head comparison is at Mneme vs Cursor Rules.