What changes when MCP is the only way in

I decided Akopia is consumed via MCP only. No public REST, no UI, no ad-hoc admin panel. What happened next is what I didn't expect: the system became radically more honest about what it actually did.

This follows on from the broader Akopia overview, but it's worth pulling out as its own piece because the rule had effects far beyond the architectural diagram. The public repo at gitea.rlabs.cl/rlabs-cl/akopia shows the shape of the result. What this post is about is the discipline that produced the shape, and the things I had to give up — and unlearn — to keep it.

The Rule: One Consumption Surface

The rule is one sentence and it's been load-bearing for the last year of work. Every capability exposed to the outside world goes through the MCP server. Not most. Not the public ones. All of them. The MCP server is the entire consumption surface.

The API hub still exists. It handles ingest from gitea, from git mirrors, from NAS shares, and it exposes internal observability hooks for the platform team. But it is not public, not in the network-accessibility sense and not in the user-facing sense. No external caller has a credential that reaches it. The choice is deliberate: the ingest path is operational, the consumption path is product, and conflating them creates more attack surface than convenience.

No "convenience" REST endpoints. No admin panel. No public CLI that wraps the API. The rule that made this stick is itself one sentence: if a capability doesn't deserve to be an MCP tool, it probably doesn't deserve to exist. The rule isn't dogmatic — it's diagnostic. Anything that fails the test should be reexamined, not exempted.

What That Rule Forces You to Do

The forcing function turns out to operate at every level of the design, in ways that prompts and READMEs don't.

Every tool needs a self-explanatory name. Not process_request. Not handle_query. Not do_thing. Names like search_semantic, search_lexical, reindex_namespace — names that an agent can read out of a tool list and immediately know whether to invoke. The brutality of that constraint is that it surfaces every place where you'd been hiding behind a generic name because you hadn't done the work of figuring out what the tool actually did.

Every parameter needs an agent-readable description, and "agent-readable" is a stricter bar than "human-readable" because the agent has no other context. The description has to carry the semantics that, in a UI, would have been carried by the surrounding screen. "The namespace of the index to delete" doesn't have a tooltip explaining what a namespace is. It has to tell you, in the same field. The cost of writing those descriptions is the same cost you should have been paying for the UI labels but probably weren't.

Destructive tools must declare themselves explicitly — destructive: true as a flag in the tool definition. This pattern is inherited directly from AgentGuard's ADR 0006, where it solved the same problem one layer up: agents need to know what they're about to do before they do it, and the only safe way to communicate that is with a typed marker the calling runtime can act on automatically.

For example:

{
  "name": "delete_index",
  "description": "Permanently delete a vector index and all associated documents. This operation cannot be undone.",
  "destructive": true,
  "parameters": {
    "index_name": {
      "type": "string",
      "description": "The namespace of the index to delete"
    }
  }
}

And structurally, you stop being able to hide mutable state behind a screen flow. There's no wizard. There's no multi-step form where step three depends on step two. Every invocation is an explicit input producing a deterministic output, which means every interaction is auditable by construction — the input is logged, the output is logged, there's no "and then the user clicked through" that hides the intermediate state.

The Honesty Test I Didn't Expect

The surprise wasn't that the rule produced clean tools. The surprise was that it produced cleaner thinking, in ways I hadn't planned for.

When a tool couldn't be described in one line, I discovered it was mixing two responsibilities. This kept happening. I'd start writing the description, get to "and also", and realize the tool was doing two jobs. The agent-readable format wouldn't let me hide the overlap behind "flexible" or "configurable". Either split it or rewrite the description to acknowledge the overlap, and the second option always felt embarrassing enough that the first one won.

When a parameter needed "see external documentation", I discovered the contract was broken. A self-contained tool description shouldn't depend on a doc the agent has to go fetch. Every time I caught myself reaching for that escape hatch, the right move was to either inline the missing context into the description or simplify the parameter so the context wasn't needed. The escape hatch became a flag for design debt.

Three tools I originally had collapsed into one plus a mode parameter, because MCP forced me to see they were variations of the same thing. The reverse happened twice, too — I had to split a tool in two because the agent kept invoking it wrong, and the failure pattern made clear that the single-tool design was hiding two distinct contracts behind one name. The medium gives feedback you can't argue with: if the agent can't use the tool right, the tool is wrong, and you can see it directly in the call logs.

What Changes in How the Team Operates

The operational consequences extend past the design phase into how the team works day-to-day.

Debugging shifts. You can't open "the dashboard" to see what's happening because there's no dashboard. You read structured logs and you invoke tools yourself — the same tools the agents are using, with the same arguments, against the same backend. The investigation tool and the production tool are the same tool. Bugs that depend on weird state are reproducible, because the state is captured in the tool inputs.

Onboarding shrinks. New devs don't get pointed at API docs and a UI tour. They get pointed at the list of MCP tools. The list IS the documentation, and it's current by construction — because if a tool ships, its descriptor ships with it, and if the descriptor is wrong, the agent's behavior reveals the lie immediately. The doc-drift problem largely dissolves.

Support conversations get sharper. When an internal user reports "X doesn't work", the very first question is "which tool did you invoke, with what arguments, what did it return". The conversation skips the entire layer of "I clicked the button and it didn't do anything" because there's no button. The answer is always structured, always reproducible, always actionable.

The Three Times I Broke the Rule and Regretted It

This is the section I want to be most honest about, because the regrets are universal patterns and I suspect anyone running infrastructure has at least one of these in their system right now.

The "emergency" REST endpoint for deleting indexes. Built one afternoon under deadline pressure because the MCP path didn't yet expose deletion and I needed it for a cleanup. "Just temporary, I'll remove it next week." Used three times. Stayed live for two years. Eventually surfaced as a security finding because it had a weaker auth path than the main surface and nobody had been monitoring it. The half-life of "temporary" infrastructure is measured in years, not weeks.

The "temporary" admin panel for health monitoring. Same story, different shape. Built because the MCP health tool was incomplete and I wanted a quick visual. Within a month, the admin panel and the MCP health tool reported subtly different numbers, because the queries that powered each had drifted apart. The team started asking which one was "right". Neither was, in the sense that no one had decided. The drift cost more than the original incompleteness would have.

The CLI with hidden debug flags. The worst of the three because it left no trace. The CLI had a --internal-mode flag that switched it to a different code path I used for development. Nobody else on the team knew it existed. When I was on vacation and something broke in a way that needed that flag to diagnose, no one could figure out how to reproduce my workflow. The knowledge was tribal of the worst kind — held by exactly one person, with no surface that revealed its existence.

Every one of those exceptions was a violation of the one-surface rule, justified at the time by short-term convenience, paid back at length over years.

Transferable Principle

The principle generalizes past MCP, past Akopia, past this particular technology cycle.

One consumption surface forces you to be honest about what the system actually does. When you have one door, you can't hide behind "well, the chat hides this complexity" or "the admin panel handles this case". The single surface has to express the truth, and the truth is what gets debugged, audited, and reasoned about.

If you find yourself needing "convenience" endpoints outside the main surface, treat that as a signal — either the surface isn't expressive enough and you should fix it there, or the convenience hides functionality that you've quietly chosen not to audit. Both diagnoses are uncomfortable, and both are better to surface than to leave festering.

MCP is the relevant example today, but the principle predates MCP and will outlast it. Having one door isn't restriction — it's architectural discipline. REST-only APIs in the early 2010s had the same property. Single-CLI tooling in earlier eras had it. The shape of the door changes every five years. The discipline doesn't.

What's the "temporary" endpoint or the script with hidden flags still alive in your system because "it's easier this way"? Send me a DM or reach out via the contact channels at rlabs.cl — I suspect it's a universal pattern.

#AI #Engineering #DeveloperExperience #Architecture #Platform