For most of the past year, getting an AI coding agent to respect a project’s architectural decisions has been a paste-and-pray exercise. You write a CLAUDE.md, you write Cursor rules, you maintain a system prompt, and you hope the model reads them. Sometimes it works. Often the rules are ignored because they sit outside the agent’s natural workflow: static text in a corner, not a tool the agent reaches for when it needs an answer.

The Model Context Protocol changes the shape of this problem. Instead of you pushing context into the model, the agent pulls it. It calls tools, queries resources, asks for what it needs at the moment it needs it. That changes how a governance layer should be built. Mneme now ships an MCP server, and the experience of using it inside Claude Code is materially different from anything I had before.

“You do not give the agent the rule book. You give the agent a librarian, and the librarian knows the rule book.”

What the MCP server exposes

The Mneme MCP server is a thin layer in front of the same decision store the CLI uses. Decisions live in .mneme/ in your repository — markdown files with structured frontmatter, version-controlled like everything else. The MCP server exposes them as three kinds of MCP primitives:

Resources for raw decision documents. The agent can list them, read individual ones, and follow references between them. This is the equivalent of letting the agent open a file in .mneme/ without having to know the path.

Tools for queries that are not just file reads. mneme_search takes a natural-language query and returns the decisions that match. mneme_relevant_for takes a file path and returns the decisions that govern code in that file. mneme_check runs the same validation that CI runs and returns the diff between intended and actual state.

Prompts for canned workflows. The agent can ask Mneme for the right prompt template when it wants to record a new decision, or when it wants to draft an exception note for a violation that should be merged anyway.

None of this is exotic. The point is that it gives the agent a shape of access that matches how it actually behaves — iterative, query-driven, narrow in scope per call.
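To make the two query tools concrete, here is a minimal sketch of what a search and a path lookup could look like over an in-memory decision set. It is not Mneme's implementation. The `Decision` class, the `scopes` field, the glob patterns, and the `EXAMPLE-1` entry are all assumptions made for illustration, and plain keyword overlap stands in for whatever matching the real index does.

```python
import re
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Decision:
    id: str
    text: str
    scopes: tuple  # hypothetical field: glob patterns this decision governs


# The three ADR texts are the ones quoted in this post. Their scopes and
# the EXAMPLE-1 entry are invented for this sketch.
DECISIONS = [
    Decision("ADR-014", "Use the Repository pattern for any DB access from controllers.", ("billing/*", "api/*")),
    Decision("ADR-022", "Stripe webhooks must be idempotent; use stripe_event_id as the key.", ("billing/*",)),
    Decision("ADR-027", "Subscription state mirrored to internal billing_accounts table.", ("billing/*",)),
    Decision("EXAMPLE-1", "Placeholder decision about frontend assets.", ("web/*",)),
]


def _tokens(text):
    """Lowercase word tokens; underscores are kept so identifiers stay whole."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def search(query, decisions=DECISIONS):
    """Rank decision ids by how many query words appear in the decision text."""
    wanted = _tokens(query)
    scored = [(len(wanted & _tokens(d.text)), d.id) for d in decisions]
    hits = [(score, did) for score, did in scored if score > 0]
    hits.sort(key=lambda pair: (-pair[0], pair[1]))
    return [did for _, did in hits]


def relevant_for(path, decisions=DECISIONS):
    """Return ids of decisions with at least one scope glob matching the path."""
    return [d.id for d in decisions if any(fnmatchcase(path, g) for g in d.scopes)]
```

Both functions can only return ids that already exist in the decision set, so a query with no match yields an empty list rather than an invented rule.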

What this looks like inside Claude Code

Once the server is registered (a single entry in .mcp.json, no daemons to run), Claude Code sees Mneme as a set of tools it can call. The first time it touches a file in a session, it tends to ask Mneme what decisions are relevant. The behaviour is not prompted. It picks up the pattern from the tool descriptions, which is exactly what MCP is for.

> I need to add a new endpoint that creates a Stripe subscription.
Checking project decisions for relevant context...
[mneme_relevant_for billing/]
  ADR-014 Use the Repository pattern for any DB access from controllers.
  ADR-022 Stripe webhooks must be idempotent; use stripe_event_id as the key.
  ADR-027 Subscription state mirrored to internal billing_accounts table.
Proceeding with these constraints applied.

The agent fetched three relevant decisions out of dozens, without any prompt about Mneme, without any paste from me. The constraints arrived at the moment they mattered. Code that came out of that session respected all three.
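For reference, the registration step mentioned earlier amounts to adding one entry under the `mcpServers` key of the project's .mcp.json. The sketch below adds such an entry without disturbing servers that are already configured. The server name `mneme`, the command, and the `["mcp"]` argument are placeholders I have assumed, not the documented launch command, and `register_server` is a hypothetical helper.

```python
import json


def register_server(config, name, command, args):
    """Return a copy of an .mcp.json config with one stdio server entry added."""
    servers = dict(config.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    return {**config, "mcpServers": servers}


# An existing config with an unrelated (made-up) server already present.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}

# "mneme" and ["mcp"] are assumed placeholders for the real launch command.
updated = register_server(existing, "mneme", "mneme", ["mcp"])
print(json.dumps(updated, indent=2))
```

The helper returns a new dictionary instead of mutating its input, which keeps the original config available for a diff before anything is written to disk.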

Why pull beats push

The thing that surprised me about MCP integration was not the speed. It was the discipline. When you push context into a model up front, you have to decide what to push without knowing what the agent will actually work on. Half the rules are irrelevant for the current task and clutter the context window. The other half are relevant but compete with the user’s actual request for attention.

When the agent pulls, it pulls what it needs. The context window stays clean. The decisions that get loaded are the ones in scope for the task. The decisions that are out of scope stay on disk — readable, version-controlled, but not consuming attention budget.

This matches the principle behind shift-left governance: enforce constraints as close to the moment of creation as possible. Pull-based access is the protocol-level expression of that principle. The rule shows up when the code is being written, not after.

How the server stays honest

Three design decisions kept the MCP layer from becoming a leaky abstraction.

Resources are deterministic. A Mneme decision is a file. Reading it through MCP returns exactly the file content, with no synthesis layer. If the agent makes a choice based on what the server returned, you can audit the choice against the same file the agent saw.

Search is grounded. The mneme_search tool runs against the local decision index, not against a model. There is no hallucinated decision. If the agent reports a constraint, that constraint exists in the repo, signed off by a human at a point in history.

Writes are flagged. The MCP server can record a new decision on the agent’s behalf, but those records arrive in a proposed/ queue, not in the active set. A human promotes them. Letting agents propose decisions is useful; letting agents make them binding is not.
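The propose-then-promote split can be sketched with nothing more than a directory move. This is an illustration under assumptions, not Mneme's code: the function names, the flat file layout, and the `draft-retry-policy` slug are hypothetical. Only the idea of a proposed/ queue sitting apart from the active set comes from the description above.

```python
import tempfile
from pathlib import Path


def propose(root, slug, body):
    """Agent-side write: the record lands in proposed/, never in the active set."""
    queue = root / "proposed"
    queue.mkdir(parents=True, exist_ok=True)
    target = queue / f"{slug}.md"
    target.write_text(body, encoding="utf-8")
    return target


def promote(root, slug):
    """Human-side step: move a proposal out of the queue into the active set."""
    source = root / "proposed" / f"{slug}.md"
    target = root / f"{slug}.md"
    if target.exists():
        raise FileExistsError(f"{target.name} is already an active decision")
    source.rename(target)
    return target


def active_decisions(root):
    """List active decision slugs; the glob is non-recursive, so proposed/ is skipped."""
    return sorted(p.stem for p in root.glob("*.md"))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / ".mneme"
    propose(root, "draft-retry-policy", "Retry failed webhook deliveries.\n")
    before = active_decisions(root)   # [] : the proposal is not binding yet
    promote(root, "draft-retry-policy")
    after = active_decisions(root)    # ['draft-retry-policy']
    print(before, after)
```

Because promotion is a separate function the agent-facing path never calls, the boundary between proposing and deciding is enforced by structure, not by trusting the agent to behave.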

Where this is heading

The interesting frontier is not making models smarter. It is making the protocol around them sharper. MCP has done more for the practical usability of governance tooling in six months than a year of prompt engineering did. I expect every serious developer-tools company to ship an MCP server in 2026 because the pattern works — agents call tools the way humans hire colleagues.

For Mneme specifically, the MCP server is now the primary way I use the system in my own work. The CLI is still there. CI still enforces. But the moment-to-moment experience is an agent that fetches its own constraints, applies them without being asked, and lets me spend my attention on the part of the work that actually wanted me.