Definition

Artifact provenance is the record of plans, diffs, screenshots, browser recordings, test outputs, and verification evidence produced by autonomous agents during a task.

What agent artifacts prove

Modern agent-first IDEs (Google Antigravity is the cleanest example today) collect a structured trail per task — the plan the agent generated, the commands it ran, the files it touched, the browser interactions it performed, the verification outputs it produced. That trail is genuinely useful for human review and debugging.

Artifacts make a previously opaque process inspectable.

What artifacts do not prove

What an artifact stream cannot establish, on its own:

  • That the agent respected the architectural decisions that govern this codebase
  • That the agent did not introduce a forbidden dependency or pattern
  • That the agent stayed inside the scope it was supposed to touch
  • That the agent’s remediation loop did not itself violate another invariant
  • That a parallel agent in another workspace made compatible choices

These are policy questions. They require a separate enforcement layer.

Why provenance needs policy

Provenance and policy are different jobs:

  • Provenance tells you what happened. It is a record.
  • Policy tells you what is allowed. It is a constraint.

You can read a perfect provenance trail and still not know whether the work belongs in the system. You can have a perfect policy layer and still not be able to debug why a particular run went sideways. The two compose.

A screenshot can prove the agent opened the browser. It cannot prove the agent respected your architecture.

How governance results become part of the artifact trail

The robust pattern is for the policy layer’s verdict to become a first-class artifact in its own right. The governance check runs against the proposed change, returns a structured verdict — PASS, WARN, FAIL with a provenance trace back to the originating ADR — and that verdict joins the rest of the agent’s trail.

The reviewer then sees, side by side: what the agent did, and whether the agent was allowed to do it.

The Mneme stack

This concept sits inside a broader chain Mneme is building:

retrieval freshness provenance governance verification

Each layer feeds the next. Retrieval surfaces the relevant decisions. Freshness keeps them current. Provenance records what the agent did with them. Governance constrains what the agent is allowed to do. Verification proves the constraint held.

Related concepts