RLabs Lab

Applied research.

Technical publications from the lab. We document what we build and what we learn, with reproducible data where applicable.

LLM Pipelines

April 2026 · Ramón Labbé
Show the Rubric Before You Ask for the Work
Quality Contracts as a Prevention Layer in LLM Generation Pipelines
Multi-stage pipelines for large language model (LLM) generation of structured artefacts — code, documents, specifications — commonly layer a terminal validation step on top of a linear chain of generation prompts. In practice, the generating model encounters the evaluation rubric only at that terminal step, after the artefact is already written. We propose a quality_contract block emitted at the top of every generation stage, carrying the same self-challenge criteria, expected structure, and validation checks the terminal step will use. Measured over the ten built-in archetypes shipped with AgentGuard, the contract adds an average of 1,498 bytes (approximately 374 tokens) per stage response. We argue the token overhead is favourable compared to a single correction cycle on a long artefact, and that the principle — show the rubric at generation time, not only at grading time — generalises to other LLM pipelines with declared evaluation criteria.
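As a rough illustration of the idea in this abstract, the sketch below places a quality_contract block at the top of a stage response and estimates its size. The field names, the helper functions, and the example criteria are hypothetical and are not taken from AgentGuard; the 4-bytes-per-token ratio is an assumption that merely matches the 1,498 bytes and roughly 374 tokens quoted above.

```python
import json

def build_quality_contract(criteria, expected_structure, validation_checks):
    """Bundle the rubric the terminal validator will apply (hypothetical field names)."""
    return {
        "quality_contract": {
            "self_challenge_criteria": list(criteria),
            "expected_structure": list(expected_structure),
            "validation_checks": list(validation_checks),
        }
    }

def attach_contract(stage_response, contract):
    """Return a new response dict with the contract as its first key."""
    return {**contract, **stage_response}

def overhead(contract, bytes_per_token=4):
    """Serialized size in bytes plus a rough token estimate (assumed ~4 bytes per token)."""
    size = len(json.dumps(contract).encode("utf-8"))
    return size, size // bytes_per_token

# Illustrative values only; a real contract would mirror the terminal validator.
contract = build_quality_contract(
    criteria=["Does every public function have a docstring?"],
    expected_structure=["src/", "tests/"],
    validation_checks=["syntax", "imports_resolve"],
)
response = attach_contract(
    {"stage": "skeleton", "instructions": "Generate the file tree."},
    contract,
)
size, tokens = overhead(contract)
print(next(iter(response)), size, tokens)
```

Putting the contract first means the generating model reads the rubric before the stage instructions, which is the ordering the abstract argues for.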
April 2026 · Ramón Labbé
Inside AgentGuard
A Four-Stage Pipeline for Structured LLM Generation
Most public discussion of LLM-assisted code generation stops at the prompt. The engineering reality of a production pipeline is different: there are stages, a contract between stages, a session store, a validator, and a set of boundary conditions that decide whether the pipeline can be paused, resumed, or replayed. This paper documents the four-stage architecture of the AgentGuard generation pipeline — skeleton, contracts-and-wiring, logic, and validate — and explains the design decisions behind each boundary. It is a companion to the author's prior work on the mechanism, the empirical benchmark, and the enterprise argument; its role is to make the architecture legible enough that both the benchmark and the enterprise argument can be evaluated against the system they describe.
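A minimal sketch of how a staged pipeline with a session store can be paused and resumed is shown below. Only the four stage names come from the abstract; the Session class, the run function, and the handlers are hypothetical stand-ins for real model calls, and the actual AgentGuard implementation may differ.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# Stage order as named in the abstract.
STAGES = ["skeleton", "contracts_and_wiring", "logic", "validate"]

@dataclass
class Session:
    """Hypothetical session store: keeps each completed stage's output."""
    outputs: Dict[str, str] = field(default_factory=dict)

    def next_stage(self) -> Optional[str]:
        """First stage without a stored output, or None when all are done."""
        for name in STAGES:
            if name not in self.outputs:
                return name
        return None

def run(session: Session,
        handlers: Dict[str, Callable[[Dict[str, str]], str]],
        stop_after: Optional[str] = None) -> Session:
    """Run pending stages in order; pause once stop_after has completed."""
    while (stage := session.next_stage()) is not None:
        # Each handler sees a copy of the outputs of all earlier stages.
        session.outputs[stage] = handlers[stage](dict(session.outputs))
        if stage == stop_after:
            break
    return session

# Stand-in handlers; a real pipeline would call a model here.
handlers = {
    name: (lambda prev, n=name: f"{n} after {len(prev)} stage(s)")
    for name in STAGES
}

session = run(Session(), handlers, stop_after="contracts_and_wiring")
paused_at = session.next_stage()   # the stage the pipeline would resume from
session = run(session, handlers)   # resume from the stored state
```

Because progress lives in the session rather than in the call stack, resuming amounts to calling run again with the same session.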
April 2026 · Ramón Labbé
Archetypes as Enterprise Primitives
Why Form Beats Function for AI-Assisted Development
Large language models are increasingly capable of meeting the functional requirements asked of them — the code works, the feature ships. Enterprise adoption of AI-assisted development, however, is not bottlenecked by function; it is bottlenecked by form: repeatability across teams, consistency across releases, auditable boundaries between generated and hand-written code, and the governance that accumulates around them. This paper argues that declarative archetypes — in the AgentGuard sense — are a minimum-viable primitive for enterprise AI development, because they convert implicit prompt-level intent into an explicit, version-controlled artefact that scales to organisational size. The argument is theoretical and draws on contract programming, specification-driven development, and software architecture. It is the closing paper of a four-part series that describes the mechanism, measures it empirically, and documents the implementing architecture.

Theory

The Two Pillars