Interactive demo

Same prompt. Same model.
Different answer.

Because one run sees your project's architectural decisions.

By Theo Valmis


What you're seeing

The interactive panel above runs the same prompt — "Refactor the storage backend for scalability" — through the same model twice. The two outputs disagree because one of them is governed by the project's architectural decisions and the other is not.

Without Mneme, the model recommends what its training data suggests for a generic backend refactor: migrate JSON storage to PostgreSQL or Redis, introduce an ORM, add a migration layer. It is reasonable advice in the abstract. It also ignores three architectural decisions this codebase has already made.

With Mneme, the relevant decision records are injected into the model's context before generation:

Prompt: Refactor the storage backend for scalability.

Without Mneme
Recommends Postgres or Redis. Recommends introducing an ORM. Recommends a migration layer. Ignores ADR-001, ADR-003, and ADR-005.

With Mneme
Per ADR-001 (JSON storage only, no external DB), ADR-003 (no ORM in v1), and ADR-005 (extend before rebuild), the model proposes extending the existing JSON storage module instead of replacing it. Same prompt, same model, different answer.

Context injection makes the rules visible to the model. That is half the problem. The other half is enforcement — ensuring generated code that does reach disk is checked against the same decision corpus. The third panel runs mneme check --mode strict, which reads the generated diff and produces a structured verdict:
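Conceptually, injection is a small step: read the decision records, prepend the active ones to the prompt, then generate. A minimal sketch in Python — the field names (`id`, `rule`, `status`) are assumptions for illustration, not the real project_memory.json schema:

```python
# Hypothetical sketch: the real injection layer and project_memory.json
# schema live in the Mneme repo; the field names here are assumptions.
decisions = [
    {"id": "ADR-001", "rule": "JSON storage only; no external databases.", "status": "active"},
    {"id": "ADR-003", "rule": "No ORM in v1.", "status": "active"},
    {"id": "ADR-005", "rule": "Extend existing modules before rebuilding.", "status": "active"},
]

def inject_context(prompt: str, decisions: list[dict]) -> str:
    """Prepend active decision records to the prompt before generation."""
    rules = "\n".join(
        f"[{d['id']}] {d['rule']}" for d in decisions if d.get("status") == "active"
    )
    return f"Project architectural decisions:\n{rules}\n\nTask: {prompt}"

print(inject_context("Refactor the storage backend for scalability.", decisions))
```

The point is only that the decisions travel with the prompt: the model never has to remember them, because every generation starts from them.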

mneme check · sample output

PASS  Storage decision enforced: JSON only, no new databases.
PASS  Auth pattern respected: JWT middleware unchanged.
WARN  New dependency introduced: prisma not in approved list.
FAIL  Violates ADR-004: Repository pattern bypassed in user.service.ts.

This is the difference between context injection and architectural governance. Injection makes the decisions readable. Enforcement makes generated output answer to them. Mneme HQ does both, in one tool-agnostic layer that works across Claude Code, Cursor, GitHub Copilot, Windsurf, and custom SDK agents.
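A structured report like the one above is what lets enforcement be mechanical rather than advisory. As a sketch (not Mneme's actual internals), each verdict carries a level and a decision ID, and the worst level determines the exit code:

```python
# Illustrative sketch of verdict gating; not Mneme's actual internals.
from dataclasses import dataclass

SEVERITY = {"PASS": 0, "WARN": 1, "FAIL": 2}

@dataclass
class Verdict:
    level: str        # "PASS" | "WARN" | "FAIL"
    decision_id: str  # e.g. "ADR-004"; empty when not tied to one decision
    message: str

def exit_code(verdicts: list[Verdict], strict: bool = True) -> int:
    """FAIL always blocks; in strict mode, WARN blocks too."""
    worst = max((SEVERITY[v.level] for v in verdicts), default=0)
    threshold = SEVERITY["WARN"] if strict else SEVERITY["FAIL"]
    return 1 if worst >= threshold else 0

report = [
    Verdict("PASS", "ADR-001", "Storage decision enforced: JSON only, no new databases."),
    Verdict("WARN", "", "New dependency introduced: prisma not in approved list."),
    Verdict("FAIL", "ADR-004", "Repository pattern bypassed in user.service.ts."),
]
print(exit_code(report, strict=True))  # 1 — the FAIL blocks the change
```

Because the outcome is a plain exit code plus machine-readable verdicts, the same check can block a commit hook, fail a CI job, or feed a dashboard without any tool-specific glue.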

Try it yourself

Install and run the same flow against your own repository:

pip install mneme && mneme init && mneme check

The full source, hook integrations for Claude Code and Cursor, the GitHub Actions enforcement gate, and the project_memory.json schema are open source under the MIT license on GitHub.

For the methodology behind the 18-scenario drift test, see the benchmark page. For how Mneme aligns with NIST CAISI, the Model Context Protocol, and AGENTS.md, see the standards landscape. For the long-form argument that governance has to live outside any one AI coding tool, see architectural governance across heterogeneous AI coding agents.

More scenarios

Each scenario walks through one verdict the demo above produces — what the AI agent is asked to do, what it would generate without Mneme, what changes when the relevant decision records are injected, and what mneme check emits afterward.

Frequently asked questions

What does this demo show?
The same prompt sent to the same model produces architecturally compliant output when Mneme injects the project's decision records before generation. The third panel demonstrates mneme check, which validates generated code against those decisions and produces a structured PASS / WARN / FAIL report.
What is mneme check?
A CLI command that validates a code change against the project's decision corpus. It runs as a pre-commit hook, in CI, or interactively. Output is structured — PASS, WARN, or FAIL with decision IDs — so it can gate pull requests, feed dashboards, or block deploys. See the GitHub Actions integration for the CI pattern.
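As a rough illustration of the CI pattern — an assumed workflow file, not the official one from the repo — a pull-request gate could look like:

```yaml
# .github/workflows/mneme-gate.yml — illustrative sketch, not the official workflow.
name: mneme-gate
on: [pull_request]
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install mneme
      # A non-zero exit (FAIL, or WARN in strict mode) fails the job
      # and blocks the merge when the check is a required status.
      - run: mneme check --mode strict
```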
Does this work with Claude Code, Cursor, and GitHub Copilot?
Yes. Mneme is tool-agnostic. Hooks for Claude Code intercept Edit, Write, and MultiEdit calls; Cursor reads the same decision corpus through standard rules export; Copilot and other agents are governed via the post-generation enforcement step. See the integrations index and the heterogeneous-agents article for the full design rationale.
How is this different from CLAUDE.md or .cursor/rules?
CLAUDE.md and .cursor/rules are static text files the model is asked to respect. Mneme is a structured decision store with a precedence engine and hook-level enforcement, so compliance is not probabilistic. The full breakdown is in why prompt memory fails at scale; the head-to-head comparison is at Mneme vs Cursor Rules.
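"Precedence engine" means that when two decisions touch the same topic, the corpus itself defines which one wins instead of leaving the conflict to the model. A toy sketch of one plausible precedence rule — explicit supersedes links between records, with assumed field names:

```python
# Toy sketch of decision precedence via explicit supersedes links.
# Field names are assumptions for illustration, not Mneme's schema.
def effective_decisions(decisions: list[dict]) -> list[dict]:
    """Drop any decision that another record explicitly supersedes."""
    superseded = {old for d in decisions for old in d.get("supersedes", [])}
    return [d for d in decisions if d["id"] not in superseded]

corpus = [
    {"id": "ADR-002", "rule": "Sessions stored server-side."},
    {"id": "ADR-007", "rule": "Stateless JWT auth.", "supersedes": ["ADR-002"]},
]
print([d["id"] for d in effective_decisions(corpus)])  # ['ADR-007']
```

A static rules file has no equivalent: if CLAUDE.md contains two contradictory instructions, the model picks one unpredictably; a structured store resolves the conflict before the prompt is ever built.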