What the Open Knowledge Format Actually Is

On June 12, 2026, Google Cloud published the Open Knowledge Format announcement, releasing OKF v0.1 as an open specification. The shape is deliberately plain: a directory of markdown files, each with a YAML frontmatter block carrying fields like type, title, description, resource, tags, and timestamp. No SDK. No proprietary runtime. No vendor lock-in. Any system that can read a file can read OKF.
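
To make that shape concrete, here is a minimal Python sketch of reading one such file. The field names are the ones listed above; the sample values, the `---` delimiters, and the `parse_okf` helper are assumptions for illustration only. The parser handles flat `key: value` pairs and inline lists rather than full YAML, so treat it as a sketch and consult the OKF v0.1 specification for the actual schema.

```python
# Hypothetical OKF-style document; every value here is invented for illustration.
SAMPLE = """\
---
type: decision
title: Database access goes through repositories
description: No direct SQL in application code.
resource: services/reporting
tags: [architecture, database]
timestamp: 2026-06-12T00:00:00Z
---
All database access must go through repository abstractions.
"""


def parse_okf(text: str) -> tuple[dict, str]:
    """Split a document into (frontmatter dict, markdown body).

    Simplified on purpose: flat `key: value` pairs and `[a, b]` lists only.
    A real reader would use a proper YAML library.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block present
    end = lines.index("---", 1)  # raises ValueError if the block is never closed
    meta = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        key, _, value = line.partition(":")  # split on the first colon only
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            value = [item.strip() for item in value[1:-1].split(",") if item.strip()]
        meta[key.strip()] = value
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


meta, body = parse_okf(SAMPLE)
print(meta["type"], meta["tags"])
```

Nothing in the sketch depends on an SDK or a runtime, which is the property the format is built around: the reader needs only file access and a frontmatter parser.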

That plainness is the point. OKF is vendor-neutral by construction, and it cleanly separates who writes knowledge from who consumes it. A platform team can author and version organizational knowledge once, and a coding agent, an enterprise assistant, a workflow agent, and a search index can all read the same source without any of them owning the format. OKF formalizes a pattern the industry had already been reaching for — the practice we have described as treating your codebase context like an LLM wiki: knowledge written down, structured, and made retrievable instead of scattered.

The Three Problems OKF Solves

OKF is a genuinely useful piece of infrastructure, and it is worth being precise about what it fixes. It solves three real problems, all of them about getting knowledge to the agent.

  • Knowledge organization. Today organizational knowledge is scattered across Confluence, Notion, Google Docs, Slack threads, and half-maintained wikis. OKF gives it a single structured representation instead of a dozen inconsistent ones.
  • Knowledge portability. Because it is plain markdown with no SDK, the same knowledge directory works across coding agents, enterprise assistants, workflow agents, and search. Write once, consume anywhere.
  • Knowledge discovery. Agents can reliably locate the relevant decision instead of relying on ever-larger prompts that stuff the whole wiki into context and hope the model notices.

The value here is standardization. A common language for organizational knowledge is exactly the kind of primitive the agentic ecosystem needs, and Google publishing it as an open spec rather than a closed product is the right move. None of what follows is a criticism of OKF. It is a description of the layer that sits above it.

The Retrieval Fallacy

Here is the assumption baked into most knowledge-for-agents architectures, OKF included: Knowledge Retrieved equals Knowledge Applied. Get the right document in front of the agent and the agent will act on it. For a chatbot answering a question, that assumption mostly holds. For a coding agent that writes and ships code, it quietly falls apart.

Consider a concrete case. A team has a stored architectural decision: all database access must go through repository abstractions — no direct SQL in application code. The decision lives in the knowledge directory. The agent is asked to add a reporting endpoint. It retrieves the decision. It references it during planning. It can quote the rule back to you. And then it writes a direct SQL query against the table, because that was faster and the query was simple.

Nothing failed. Retrieval worked perfectly — the agent found the decision, read it, and understood it. What was missing was the layer that checks the proposed change against the decision and rejects it. That missing layer is governance, and its absence is how architectural drift enters a codebase one reasonable-looking commit at a time.
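
A minimal sketch of what that missing check could look like, in Python. Everything here is hypothetical: the `Decision` and `check_change` names, the `repositories/` path convention, and the regular expression standing in for "direct SQL". A production check would more likely parse the code or hook into a linter. The point is only that the verdict is computed from the proposed change itself, not from what the agent says it read.

```python
import re
from dataclasses import dataclass

# Crude stand-in for "direct SQL": a SQL verb followed later by a typical keyword.
SQL_PATTERN = re.compile(
    r"\b(SELECT|INSERT|UPDATE|DELETE)\b.+\b(FROM|INTO|SET)\b", re.IGNORECASE
)


@dataclass(frozen=True)
class Decision:
    rule_id: str
    summary: str
    allowed_prefix: str  # hypothetical convention: only files under this path may contain SQL


def check_change(decision: Decision, changed_files: dict[str, str]) -> list[str]:
    """Return one message per offending line; an empty list means the change is allowed."""
    violations = []
    for path, content in changed_files.items():
        if path.startswith(decision.allowed_prefix):
            continue  # the repository layer is where SQL belongs
        for lineno, line in enumerate(content.splitlines(), start=1):
            if SQL_PATTERN.search(line):
                violations.append(f"{decision.rule_id}: {path}:{lineno} contains direct SQL")
    return violations


decision = Decision(
    "ADR-hypothetical", "All database access goes through repositories", "repositories/"
)
proposed = {
    "api/reports.py": 'rows = db.execute("SELECT id, total FROM orders")',
    "repositories/orders.py": 'return db.execute("SELECT id, total FROM orders")',
}
violations = check_change(decision, proposed)
print(violations or "change allowed")
```

In this example the same query is accepted inside the repository layer and rejected in the endpoint, regardless of whether the agent retrieved or quoted the rule beforehand.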

This is the failure mode we keep returning to: memory is not governance, and the same logic applies to discovery. Retrieving a security requirement, a naming standard, or a framework convention tells the agent what the rule is. It does nothing to guarantee the agent obeys it. Knowing and complying are different systems with different failure modes, and a knowledge format only addresses the first.

The reason this matters more for agents than for people is throughput. A human engineer who reads the repository-abstraction rule internalizes it once and applies it across hundreds of changes. An agent re-reads the rule on every task and re-decides whether to honor it on every task, under local pressure to produce a working diff fast. Each decision is independent, which means the rule has to win every single time or the boundary erodes. A storage format that makes the rule easy to find does not change the odds at the moment the agent chooses to bypass it.
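
The arithmetic behind "the rule has to win every single time" is worth seeing once. The sketch below assumes, purely for illustration, that each task is an independent trial with the same probability of honoring the rule. The 99% figure and the task counts are made-up inputs, not measurements.

```python
def chance_boundary_holds(p_comply: float, tasks: int) -> float:
    """Probability that a rule honored independently per task holds on every task."""
    return p_comply ** tasks


for tasks in (10, 100, 500):
    print(tasks, round(chance_boundary_holds(0.99, tasks), 3))
```

Under these assumptions, an agent that honors the rule 99% of the time keeps the boundary intact across 100 tasks only about 37% of the time. Making the rule easier to find does not change that exponent.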

The Four-Layer Agentic Stack

It helps to place OKF in the stack that the agentic ecosystem is assembling. Each layer answers a different question, and the questions are not interchangeable.

| Layer | Examples | Question it answers |
|---|---|---|
| 1. Knowledge storage | OKF, markdown repos | What does the organization know? |
| 2. Retrieval | RAG, vector DBs, semantic search | What is relevant right now? |
| 3. Planning | Agents, orchestration | What should be done? |
| 4. Governance | Policy enforcement, decision validation, execution controls | Is this action allowed? |

Most of the public conversation lives in layers one through three. OKF is a clean layer-one contribution. Vector databases and semantic search occupy layer two — though, as we have argued, retrieval is not the same as memory, and neither is the same as enforcement. Agent frameworks and orchestrators own layer three.

Layer four is becoming first-class, and the reason is structural. Agentic AI generates actions, not text. A chatbot that ignores the docs gives a mediocre answer and the cost stops there. A coding agent that ignores the docs introduces a security vulnerability, breaks an architectural boundary, creates a compliance violation, or quietly raises technical debt — and that cost ships to production. When the output is executable, "did it follow the rules" stops being a quality nicety and becomes a control requirement.

Notice that the four layers fail in different directions. Layers one through three are permissive by design: storage holds everything, retrieval surfaces what is relevant, planning proposes whatever moves the task forward. Their job is to make more knowledge usable. Layer four is the only one whose job is to say no — to reject an action that contradicts a decision the organization already made. You cannot build a permissive layer well enough to produce a restrictive guarantee. Better storage, better retrieval, and better planning all increase what the agent can do; none of them constrains what it is allowed to do. That asymmetry is why governance cannot be folded into the lower layers and has to sit on top of them as its own concern.
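
The asymmetry can be sketched as a toy pipeline in Python. All function names and the substring test are hypothetical stand-ins, not part of OKF or any agent framework. The first three stages only ever return more material to work with; the last stage is the only one whose return value can stop the action.

```python
from typing import Callable, Optional

# Layer one (storage): holds everything the organization has written down.
KNOWLEDGE = {"db-access": "All database access goes through repositories."}


def retrieve(task: str) -> list[str]:
    """Layer two: surface what looks relevant. It never rejects anything."""
    return list(KNOWLEDGE.values())


def plan(task: str, context: list[str]) -> str:
    """Layer three: propose whatever completes the task, even if context says otherwise."""
    return 'db.execute("SELECT * FROM orders")'


def govern(action: str, forbidden: Callable[[str], bool]) -> Optional[str]:
    """Layer four: return the action if allowed, or None to reject it."""
    return None if forbidden(action) else action


def run(task: str) -> Optional[str]:
    context = retrieve(task)
    action = plan(task, context)
    # Toy policy: any action containing a SELECT statement is rejected.
    return govern(action, forbidden=lambda a: "SELECT" in a.upper())


print(run("add reporting endpoint"))
```

Improving `retrieve` or `plan` in this sketch changes what gets proposed; only `govern` changes what gets through.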

Memory, Retrieval, Storage, Governance: Four Different Questions

The vocabulary in this space is overloaded, so it is worth separating the terms by the question each one answers. They are not competing solutions to one problem. They are sequential answers to four different problems.

| Capability | Question it answers |
|---|---|
| Storage | What do we know? |
| Retrieval | What should the agent read? |
| Memory | What should the agent remember? |
| Governance | What must the agent obey? |

OKF is a storage standard. It makes the first row excellent. But better storage does not improve the last row, because obedience is a property of enforcement, not of how knowledge is filed. This is the same distinction that makes RAG insufficient for architectural governance: a retrieval system can surface the perfect constraint and still have no mechanism to stop the agent from violating it. The gap is not in what the agent can find. It is in what happens to the agent’s output when it ignores what it found.

If you are evaluating where this fits in a real workflow, the practical question is which constraints your agents must never violate, regardless of what they retrieve — the kind of thing our use cases walk through. That list is your governance layer. OKF can hold the decisions; it cannot be the thing that enforces them, and it was never designed to be.

The Next Layer Is Not Memory. It Is Governance.

Open Knowledge Format is an important common language for organizational knowledge, and standardizing how agents discover what a company knows is a real contribution to the ecosystem. But structured memory alone does not solve the hardest problem in agentic engineering. The challenge is no longer whether agents can find company knowledge. With OKF and the retrieval stack around it, that problem is largely handled.

The challenge is whether agents will follow what they find. A decision the agent retrieved, referenced, and then violated is not a discovery failure. It is a governance failure — and no improvement in how knowledge is stored or surfaced closes it. The way an enforced decision actually reaches every agent action is its own discipline, the one we call governance propagation: a constraint checked deterministically against each proposed change before that change becomes part of the system. OKF makes the knowledge legible. Governance is what makes it binding. The next layer of agentic infrastructure is not memory. It is governance.