[Figure: Evidence-RAG loop (execution, atoms, retrieval, next execution). Execute: agent + tools. Emit: evidence atoms. Index: retrievable. Retrieve: next run uses past evidence. Caption: agents learn from their own traces.]

Your cron jobs should write markdown: the automation-evidence loop

June 1, 2026 · Implementations

I have 25 automated workflows running all day. Until six months ago, when something failed, the only way to understand what had happened was to open the n8n dashboard and stare at runs — scrolling, clicking into execution views, copy-pasting JSON into a notepad to read it. Now every workflow writes markdown that gets committed to git, the knowledge base indexes it, and an agent can answer "what happened with the payouts monitor last Tuesday?" without ever touching n8n.

The change wasn't about replacing n8n. It's still excellent at executing. The change was about not letting the operational knowledge of the system live only inside n8n's UI.

The Pain: Automation That Lives Only in Its Dashboard

n8n is great for what it does — visual workflow building, broad connector library, fast to iterate. But its UI isn't queryable by agents, and that turned out to matter more than I'd realized.

When a workflow failed, the recovery path was always the same. Open the browser. Find the workflow in the list. Scroll through recent runs. Open the execution view. Click into the failing node. Stare at JSON. The information was there. It just wasn't reachable by anything other than a human eye on a screen.

The knowledge of what this system does was imprisoned inside the dashboard. Onboarding a new collaborator meant walking them through 25 workflows individually. Asking "is everything healthy?" meant opening 25 dashboard tabs. Nothing scaled.

The real cost showed up when I wanted to correlate two workflows. Does the SSL cert renewal happen before the next F29 tax filing reminder? Did the payout monitor's failure pattern line up with a CPU spike on the same node? Those questions can't be answered inside n8n at all — the workflows are visual silos. Each lives in its own canvas, blind to the others.

And the cost compounds with team growth. When it's just me, holding the mental model of 25 workflows is annoying but possible. When a second person joins, every cross-workflow question routes through a conversation. "Hey, do you know if WF-B5 already handles X?" Twenty of those a week and you've replaced a queryable system with synchronous human bandwidth. That's the moment I knew the dashboard model wouldn't scale past one operator.

The Loop: Workflow -> Markdown -> Git -> KB -> Agent

The loop I ended up with has five stages, four hops between them, and one feedback edge from the agent back to the next run.

Each workflow ends with a node that writes a structured markdown report. That's the only addition to the workflow itself — one extra node.

The report gets committed to a dedicated repo called n8n-reports, separate from the workflow definitions themselves. Separation matters: the workflow repo evolves slowly (you change workflows occasionally), the reports repo evolves constantly (every execution).

Akopia, the headless knowledge layer, indexes n8n-reports at the same cadence as everything else in the ecosystem. Every commit goes through embeddings and BM25 indexing. The reports become first-class queryable content.
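
Akopia's internals are not shown in this post. To make the BM25 half of that indexing concrete, here is a small, self-contained sketch of Okapi BM25 scoring over report texts. The `tokenize` and `bm25_scores` helpers and the `k1` and `b` defaults are illustrative assumptions, not Akopia's implementation.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Keep identifiers such as WF-B3 or start_ts together as single tokens.
    return re.findall(r"[a-z0-9_-]+", text.lower())


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with Okapi BM25."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Length normalization: long reports are penalized relative to the average.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Exact identifiers in the reports (workflow IDs, front matter keys) are what make lexical scoring like this useful alongside embeddings: a query for `WF-B3` matches the token exactly rather than something merely similar.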

Any agent with access to mcp__akopia__search_* can now ask about state, history, anomalies — without ever opening a dashboard. The dashboard becomes optional. The repo becomes the source of truth.

Anatomy of the Markdown Report

The shape of the report is what makes the indexing useful. Each report has:

  • Front matter with workflow_id, run_id, start_ts, status, duration_ms — the structured fields agents query against
  • An "Inputs" section describing what triggered the run: cron, webhook, manual trigger
  • An "Outputs" section listing what was produced, detected, or modified — in bullets, not raw JSON
  • An "Anomalies" section describing anything off, in short prose
  • A "Next check" section noting when this workflow runs again and what would change the conditions
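
As a minimal sketch, the report shape described above could be rendered like this. The post does not show the actual template, so the YAML-style `---` delimiters, the `## ` section headings, the `Report` class, the `render_report` helper, and the status values are all assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Report:
    # Front matter fields named in the post.
    workflow_id: str
    run_id: str
    start_ts: str          # assumed to be an ISO 8601 timestamp
    status: str            # "success" / "failed" are assumed values
    duration_ms: int
    # Body sections named in the post.
    inputs: str = ""
    outputs: list[str] = field(default_factory=list)
    anomalies: str = ""
    next_check: str = ""


def render_report(r: Report) -> str:
    """Render front matter followed by the four fixed sections."""
    front = [
        "---",
        f"workflow_id: {r.workflow_id}",
        f"run_id: {r.run_id}",
        f"start_ts: {r.start_ts}",
        f"status: {r.status}",
        f"duration_ms: {r.duration_ms}",
        "---",
    ]
    # Outputs are bullets, not raw JSON.
    bullets = [f"- {item}" for item in r.outputs] or ["- none"]
    body = [
        "", "## Inputs", "", r.inputs or "none",
        "", "## Outputs", "", *bullets,
        "", "## Anomalies", "", r.anomalies or "none",
        "", "## Next check", "", r.next_check or "not scheduled",
    ]
    return "\n".join(front + body) + "\n"


# Illustrative values only; not taken from a real run.
example = Report(
    workflow_id="WF-B3", run_id="run-0001", start_ts="2026-05-26T09:00:00",
    status="success", duration_ms=1840, inputs="cron",
    outputs=["all checks passed"], next_check="next scheduled run",
)
print(render_report(example))
```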

The naming convention is rigid on purpose: reports/WF-{group}{n}/{YYYY-MM-DD}/{HHMM}.md. That gives you filter-by-workflow, filter-by-date, filter-by-time with nothing more than a directory walk. The agent doesn't need a database — the filesystem is the database.
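
A sketch of what "the filesystem is the database" can look like in practice: one helper builds the path from the convention, another filters by workflow or date with a plain glob. The helper names, the `root` argument, and the single-letter group pattern in `WORKFLOW_ID` are assumptions, not code from the actual setup.

```python
import re
from datetime import datetime
from pathlib import Path

# Assumes groups are a single uppercase letter, as in WF-A1 or WF-B7.
WORKFLOW_ID = re.compile(r"^WF-[A-Z]\d+$")


def report_path(root: Path, workflow_id: str, started: datetime) -> Path:
    """Build reports/WF-{group}{n}/{YYYY-MM-DD}/{HHMM}.md under root."""
    if not WORKFLOW_ID.match(workflow_id):
        raise ValueError(f"unexpected workflow id: {workflow_id!r}")
    return (root / "reports" / workflow_id
            / started.strftime("%Y-%m-%d") / f"{started:%H%M}.md")


def find_reports(root: Path, workflow_id: str = "WF-*", day: str = "*") -> list[Path]:
    """Filter by workflow and/or date using only a glob over the tree."""
    return sorted((root / "reports").glob(f"{workflow_id}/{day}/*.md"))
```

Because dates and times are zero-padded, sorting the paths lexically also sorts them chronologically, so "latest report" is just the last element of the list.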

[Figure: Why evidence beats memory: two retrieval strategies. Chat memory: lossy, unscoped, drifts, no attribution, no rubric; a vibe of past runs. Evidence-RAG: typed atoms, scoped queries, attributed, rubric-aware, cost-aware; a record of past runs. Caption: memory remembers; evidence proves.]

The 6 Workflow Groups in Production

The 25 workflows fall into 6 groups, and the grouping turned out to matter more for queries than I expected.

  • Accounting (A1-A5): hourly guardian on the books, USD/CLP FX rate, F29 tax filing reminders
  • Infra (B1-B7): 5-minute health checks, k3s pod monitoring, SSL expiration tracking
  • Marketplace (C1-C4): failed payouts monitor, KB platform queue health
  • Plus three smaller groups handling integrations and notifications

The naming convention WF-{group}{n} is preserved end-to-end. The same identifier shows up in the workflow definition, the reports directory, the queries the agent runs, the alerts that fire. There's no translation layer. WF-B3 means the same thing in the dashboard as in the agent prompt as in the markdown file. That consistency removed an entire class of confusion.

Queries the Loop Makes Possible

The queries are where the loop pays off. A short sample of what becomes natural:

Which workflows failed last week, and in what pattern? The agent reads reports, groups by status, summarizes failure modes. Two minutes instead of an hour.
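
The grouping step behind that question is small once the front matter is parsed. A sketch, assuming each report has already been reduced to a dict with `workflow_id`, `status`, `start_ts` (ISO 8601, no timezone suffix) and a short `anomalies` string; the `"failed"` status value and the function name are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta


def failures_last_week(records: list[dict], now: datetime) -> dict[str, Counter]:
    """Group failed runs from the past 7 days by workflow and anomaly text."""
    cutoff = now - timedelta(days=7)
    grouped: dict[str, Counter] = defaultdict(Counter)
    for rec in records:
        started = datetime.fromisoformat(rec["start_ts"])
        if rec["status"] == "failed" and cutoff <= started <= now:
            # Count how often each anomaly description recurs per workflow.
            grouped[rec["workflow_id"]][rec.get("anomalies", "unspecified")] += 1
    return dict(grouped)
```

The result maps each workflow ID to a counter of anomaly descriptions, which is already close to the summary an agent would write up.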

When was the last successful run of workflow X? Straight from the front matter. The agent doesn't even need the body.
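
A sketch of that lookup, reading only the front matter and never the body. It assumes flat `key: value` front matter between two `---` lines and a `"success"` status value; `read_front_matter` and `last_successful_run` are hypothetical helpers, not part of the described system.

```python
from pathlib import Path
from typing import Optional


def read_front_matter(text: str) -> dict[str, str]:
    """Parse flat 'key: value' front matter between two '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        # Split on the first colon only, so timestamps keep their colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # closing delimiter never found


def last_successful_run(root: Path, workflow_id: str) -> Optional[dict[str, str]]:
    """Walk a workflow's reports newest-first and return the first success."""
    reports = sorted((root / "reports" / workflow_id).glob("*/*.md"), reverse=True)
    for path in reports:
        meta = read_front_matter(path.read_text(encoding="utf-8"))
        if meta.get("status") == "success":
            return meta
    return None
```

The reverse sort relies on the zero-padded date and time in the path, so no timestamps need to be parsed to find the newest report.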

Is there a correlation between failures of workflow Y and CPU spikes in pod Z? The agent crosses the WF-B reports with metrics from elsewhere. The cross-source query only works because both sides live in indexable text.

What changed since the last successful report? git diff of the markdowns, readable by both humans and agents. No special tooling. Git already does this.
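
Inside the repo, `git diff` answers this directly. For an agent that already holds two report texts in memory, the same idea can be sketched with Python's standard `difflib` as a stand-in for git; the `report_diff` helper and the file labels are illustrative.

```python
import difflib


def report_diff(previous: str, current: str) -> list[str]:
    """Return unified-diff lines between two report texts."""
    return list(difflib.unified_diff(
        previous.splitlines(), current.splitlines(),
        fromfile="last-success.md", tofile="latest.md", lineterm="",
    ))
```

With a fixed schema, most of the diff collapses to the lines that matter: a changed `status`, a changed `duration_ms`, and whatever was written under "Anomalies".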

None of these queries existed when the data was inside n8n. They became possible the moment the data became text in git.

The most useful query category, the one I didn't anticipate, is "explain this workflow to me as if I'd never seen it." An agent reads the last week of reports and produces a plain-English summary of what the workflow actually does in production: not what the workflow definition says it does, but what it does. The two diverge more often than the workflow author thinks. That gap is where the bugs live, and it's invisible until something forces it into text.

What It Costs and Why It's Worth It

The per-workflow overhead is one extra node and roughly 200 ms of additional execution time. The infra cost is zero: git and the KB already existed for other reasons. There's no new service, no new dashboard, no new vendor.

The storage cost is about 3 MB per month of compressed markdown across all 25 workflows. Trivial.

The non-monetary cost is discipline. Workflows that don't write structured markdown stay blind to agents. There's no automatic enforcement — you have to remember to add the node. That's the friction. I accept it because the alternative is rebuilding visibility every time something breaks.

What you gain is that the n8n dashboard becomes optional. You only open it when you want to edit a workflow. For operating, monitoring, debugging, onboarding, correlating — the repo is enough.

What Didn't Work at First

The current shape is the third version. The first two failed in instructive ways.

First version: each workflow wrote JSON. The reasoning was that JSON is structured and easy to parse. The result was that the reports were indexable but unreadable by a human who opened the file. The asymmetry mattered — debugging required humans, indexing required machines, and one format couldn't serve both.

Second version: free-form markdown. The reasoning was that markdown is readable. The result was that every workflow had a slightly different format. Queries became inconsistent — "what was the start time?" depended on which workflow you were looking at. The agent had to learn 25 schemas.

Third version, the one running now: markdown with a fixed schema, mandatory front matter, free prose allowed only in the "Anomalies" section. Structured where queries happen, free-form where humans need to write specifically about what went wrong. The hybrid is what made it scale.
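
The fixed schema is easy to check mechanically. A sketch of such a check, assuming `---` front matter delimiters and `## ` section headings (the post names the fields and sections but not the exact markup); `validate_report` is a hypothetical helper.

```python
REQUIRED_KEYS = ("workflow_id", "run_id", "start_ts", "status", "duration_ms")
REQUIRED_SECTIONS = ("Inputs", "Outputs", "Anomalies", "Next check")


def validate_report(text: str) -> list[str]:
    """Return a list of schema problems; an empty list means the report passes."""
    lines = [line.strip() for line in text.splitlines()]
    if not lines or lines[0] != "---" or "---" not in lines[1:]:
        return ["missing front matter block"]
    end = lines.index("---", 1)  # index of the closing delimiter
    keys = {line.partition(":")[0].strip() for line in lines[1:end] if ":" in line}
    problems = [f"missing front matter key: {k}" for k in REQUIRED_KEYS if k not in keys]
    # Sections must appear exactly once each, in the fixed order.
    headings = [line[3:].strip() for line in lines[end + 1:] if line.startswith("## ")]
    if headings != list(REQUIRED_SECTIONS):
        problems.append(f"expected sections {list(REQUIRED_SECTIONS)}, found {headings}")
    return problems
```

A check like this only covers the structured part. What goes into the free prose under "Anomalies" still depends on whoever writes the workflow.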

The lesson: queryability requires format discipline, not just medium discipline. Markdown alone doesn't help. Markdown with a schema does.

A second lesson came from the front matter specifically. The first schemas I tried used longer, descriptive keys (workflow_name, started_at). Switching to short, stable identifiers (workflow_id, start_ts) saved real query time later, because the agent could match on exact tokens without dealing with synonyms. Small naming decisions made early either accelerate you or punish you for years.

Transferable Principle

Automation that doesn't leave a queryable trace is automation you have to re-run to understand. Every time you want to know what it did, you trigger it again, or you stare at a UI, or you ask the human who originally built it.

Your cron jobs should write markdown — not because markdown is sacred, but because it's the format both humans and agents can consume without a custom parser. Git already gives you versioning, diffs, history, and access control for free. The KB already indexes git. The cost of plugging in is one node.

The day an agent can answer "what did this system do last week" without opening a dashboard, you'll understand why this loop matters. Until then it sounds like overhead.

How many of your automated systems have their knowledge imprisoned in the vendor's dashboard? Which would be the first one you'd free to markdown? Send me a DM or reach out via the contact channels at rlabs.cl.

#n8n #MetaSoftware #DevOps #Platform #Engineering
