Why I made the knowledge base headless

I built a corporate RAG, and the first version I threw out was the one with a chat interface. The real consumer was never going to be a person — it was going to be an agent. And for an agent, the UI is noise.

This is a post about Akopia, the RAG layer that runs across our knowledge sources at RLABS, and specifically about the decision that ended up defining everything else: it has no chat, no web UI, no public REST. The only way to consume it is through its MCP server. That choice was uncomfortable to make and clarifying to live with. The code is public at gitea.rlabs.cl/rlabs-cl/akopia if you want to follow along.

The Pain: Two RAGs Coexisting in the Same Org

Almost every enterprise RAG you can buy or build is born with a chat interface, and the reason is straightforward: chat is what the buyer understands. A demo without a chat looks unfinished. A pitch without a chat looks like an API for an API. The chat is the surface that makes the product legible to the people approving the budget.

But the actual consumption pattern, once the thing is running in production, splits fast and asymmetrically. Humans ask occasionally — when they remember the tool exists, when they're stuck on something specific, when they can't find a doc by other means. Agents ask constantly, because that's how agents work: every coding session, every planning loop, every tool-using flow pings the knowledge layer dozens of times.

When agent traffic outruns human traffic 100 to 1, and that ratio has been the rough shape of what I've measured, the UI stops being an asset. It becomes ballast. It's a thing that has to be maintained, secured, redesigned every time the underlying API changes — for a consumer base that, in volume terms, barely uses it. And the worst part is that the existence of the UI obscures the question of whether the API is any good, because the UI papers over rough edges that an agent consumer would have surfaced immediately.

What Changes When You Accept the Primary Client Is an Agent

The shift, once you make it, isn't cosmetic. Almost every design choice flips. You stop designing for visual affordance — the placeholder text, the empty state, the helpful tooltip — and start designing for machine-readable schema. The tool descriptor is the user manual, and the user is a model with a context window.

The tool descriptor matters more than the input placeholder. In a UI, you can be vague in the field label because the human will figure it out from context. In an MCP tool, the descriptor is the only context. "Search the knowledge base" becomes "Search the indexed corpus across all configured sources, returning ranked passages with source URIs and relevance scores; use this when the user asks a factual question about internal documentation or code". The verbosity isn't bloat — it's the surface the agent reasons against.

The error message is what the agent receives, not what the human sees. A 500 with a stack trace is fine in a developer console. An MCP tool that returns an unstructured error blob makes the agent guess what went wrong and often guess wrong. Errors become typed, scoped, and parseable, because the consumer is going to act on them automatically rather than reading them.

And auth stops being a login screen and becomes a scoped bearer token. There's no session. There's no CSRF. There's a token attached to every call, with a scope that defines what the calling agent can see and do. That's it. The complexity that lived in the session-management layer simply disappears.

Akopia in Three Deployments and Zero Screens

What actually runs in the cluster is three things. The API concentrator is a FastAPI service that handles ingest (new documents arriving from gitea, git mirrors, NAS shares) and internal admin (reindex requests, namespace management). It's internal-only — never exposed publicly. The embeddings worker pulls jobs off a Redis stream, runs them with clamped batch size, and treats failure as non-fatal rather than blocking. The mcp-server is the only consumption surface the outside world sees, and it speaks the Model Context Protocol over the standard transport.

The stack underneath uses Qdrant for vector search, Meilisearch for lexical search, and Redis as the coordinator (more on that pairing in a separate post). Everything lives in a single k3s namespace called akopia. Auth is bearer-token with AKOPIA_STRICT_AUTH=1 set, which makes the system fail-closed: if the token is missing, malformed, or out-of-scope, the call refuses rather than degrading to anonymous access.

The architecture is intentionally adapter-shaped. There are adapters per modality (text, image, audio_transcript, video_transcript) and adapters per source (git repos, NAS shares, gitea). When we add a new source — a new file server, a new git host, a new wiki — what changes is one reader. The interface, the index, the consumption surface stay identical. That's the property that lets the system absorb new corners of the org without growing new UIs.

And the operational tell is mundane: there's no admin panel. Admin is kubectl plus curl. If you need to reindex a namespace, you call an internal endpoint with a token. If you need to inspect what's running, you read pod logs. The absence of a dashboard isn't an oversight — it's the rule the system is built around.

What You Gain by Not Building UI

The gains compound in unexpected places. Development time: by my rough accounting, somewhere around 40% of a typical enterprise RAG's engineering effort goes into the chat frontend — the conversation state, the streaming display, the source-citation rendering, the feedback widgets. That entire chunk of work stays in your pocket. The team works on the parts that actually move the retrieval quality.

Security: every surface you don't build is a surface you don't have to defend. No session means no session hijacking. No chat input means no XSS over the response stream. No web admin means no privilege escalation through a hidden form field. The attack surface shrinks to the MCP transport and the bearer-token auth, both of which are narrow enough to actually reason about.

Product honesty: this one took me a while to articulate, and it ended up being the most valuable. If you can't explain to an agent — through a JSON schema with a textual description — what your system does and how to use it, you couldn't have explained it to a human with a placeholder either. The UI was hiding the fact that the API was vague. Without the UI, the vagueness is exposed and forced to clarify itself.

And embeddability: any IDE, assistant, or pipeline that speaks MCP consumes Akopia with zero custom integration. The system slots into Claude Code, into Cline, into any custom agent harness, into a CI pipeline that wants to query the knowledge base before running tests. The integration cost on the consumer side is the cost of pointing at an MCP server. That's it.

What You Lose (and Why That's Fine)

I want to be honest about the costs, because they're real and they matter for some buyers.

Demos can't be shown by opening a URL. You need an MCP client ready. For an internal audience that already runs agents, this isn't a barrier — they have the client. For a board meeting or a sales pitch to a non-technical buyer, the absence of a clickable thing is a real friction. The "look how pretty it is" pitch doesn't exist, because there's nothing pretty to look at. The system either works in your agent loop or it doesn't, and that's a different kind of pitch.

Onboarding a non-technical stakeholder requires a human intermediary. The system doesn't sell itself to a committee. Someone has to translate "the model just answered three queries you didn't know it was running" into the form a non-technical reviewer understands. That mediation work has to happen, and it has to be planned for.

The acceptability of those costs depends entirely on who you think the real buyer is. For Akopia, the real buyer is the platform team — the people who'll wire it into their agent stack and run a hundred thousand queries a month against it. The board never touches it. The committee never touches it. Optimizing for the wrong buyer would have cost the right one.

The Transferable Principle

When your real consumer is an agent, a UI is dead weight. The principle isn't "never build UI" — it's "build for the actual consumer". And for an increasing share of internal infrastructure, the actual consumer is no longer a human in a browser.

Before you add a chat interface to your next internal tool, run the question: who's going to run tomorrow's thousand queries — a human in a browser, or an agent in a loop? If the honest answer is "the agent", then build for the agent. The UI ends up being a costume that hides how poorly defined the real API is.

And the second-order benefit, which is the one that pays the most: the discipline of having no UI keeps you honest about whether the API is any good. There's nowhere to hide a vague tool descriptor. There's nowhere to hide an unstructured error. The thing has to work as the contract it actually is — or it has to be fixed.

Which internal tool of yours has a UI mostly because "it looks more serious", even though 90% of real traffic comes from scripts or agents? Drop me a DM or reach out via the contact channels at rlabs.cl.

#AI #Engineering #Architecture #Platform #OpenSource