The shift that matters
The core problem in AI-assisted software work is not “which model should we use?” That changes constantly. The real problem is: how do we turn human intent into reliable software without losing control, quality, and responsibility along the way?
Vibe coding showed that models can generate a lot of code from natural language. That is powerful for prototypes. But in real product work, delivery cost is not just writing code. It is aligning scope, preserving architecture, protecting data, validating behavior, reviewing changes, auditing decisions, and keeping the system healthy after deployment.
Two ideas need to work together:
- SDD, or Spec-Driven Development: writing and evolving specifications that act as a contract.
- Harness Design: designing the work system around the agent so it has context, tools, boundaries, validation, and governance.
Separately, each idea helps. Together, they move the workflow from prompting in a chat to delivering against a reviewable contract inside a controlled environment.
What SDD is
SDD means Spec-Driven Development.
In this context, a spec is a living, versioned, reviewable artifact that answers:
- what problem we are solving;
- who it matters to;
- what is in and out of scope;
- which behaviors must exist;
- which technical, legal, or product constraints must be respected;
- which criteria prove the work is correct;
- which risks require human review.
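To make the checklist above concrete, here is a minimal sketch of a spec as a structured record, with a helper that reports which questions are still unanswered. The `Spec` class, its field names, `missing_sections`, and the invoice example are all hypothetical and chosen for illustration; they are not part of any SDD standard or tool.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Spec:
    """Hypothetical spec record; one field per question in the list above."""
    problem: str = ""
    audience: str = ""
    in_scope: list[str] = field(default_factory=list)
    out_of_scope: list[str] = field(default_factory=list)
    required_behaviors: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    acceptance_criteria: list[str] = field(default_factory=list)
    human_review_risks: list[str] = field(default_factory=list)


def missing_sections(spec: Spec) -> list[str]:
    """Return the names of sections that are still empty."""
    return [f.name for f in fields(spec) if not getattr(spec, f.name)]


# Illustrative draft only: the feature and wording are made up.
draft = Spec(
    problem="Users cannot export invoices as CSV.",
    audience="Finance team",
    in_scope=["CSV export of the invoice list"],
    required_behaviors=["Export respects the active filters"],
    acceptance_criteria=["Exported row count matches the filtered list"],
)

print(missing_sections(draft))
```

A check like this could run in review or CI so that an incomplete spec is flagged before an agent starts implementing against it. In practice many teams keep specs as versioned text files; the record form here only shows which sections a reviewer would expect to find.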
The central shift is simple: the agent does not implement “the idea that was implicit in the chat.” It implements against a contract.
What Harness Design is
Harness Design is the design of the agent operating environment.
If SDD answers “what must be true?”, the harness answers “how do we create a system where the agent can execute, validate, and improve without relying on informal memory?”
A model is not an agent
A model responds. An agent tries to complete a task using tools in a loop: it reads context, takes an action, observes the result, and decides the next step.
The harness is what makes that loop useful in real work. It tells the agent which rules to follow, which tools to use, where to find context, when to ask for approval, and how to prove the work is done.
agent = model + harness
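The loop described above (read context, act, observe, decide) can be sketched in a few lines. This is a simplified illustration, not how any particular agent product works: `run_agent`, the `Step` tuple shape, and `scripted_model` are hypothetical names, and the "model" is a scripted stand-in so the example runs without calling a real LLM.

```python
from typing import Callable

# A "model" here is any callable that maps the transcript so far to the next
# step: either ("call", tool_name, argument) or ("done", summary, "").
Step = tuple[str, str, str]
Model = Callable[[list[str]], Step]


def run_agent(model: Model, tools: dict[str, Callable[[str], str]],
              context: str, max_steps: int = 5) -> str:
    """Loop: read context, act with a tool, observe the result, decide again."""
    transcript = [f"context: {context}"]
    for _ in range(max_steps):
        kind, name, arg = model(transcript)
        if kind == "done":
            return name
        if name not in tools:
            # The harness, not the model, decides which tools exist.
            transcript.append(f"error: tool {name!r} is not available")
            continue
        observation = tools[name](arg)
        transcript.append(f"{name}({arg!r}) -> {observation}")
    return "stopped: step budget exhausted"


def scripted_model(transcript: list[str]) -> Step:
    """Stand-in for a real model: run the tests once, then report."""
    if not any(line.startswith("run_tests") for line in transcript):
        return ("call", "run_tests", "unit")
    return ("done", f"finished after: {transcript[-1]}", "")


tools = {"run_tests": lambda suite: f"{suite} tests passed"}
result = run_agent(scripted_model, tools, context="fix the failing export test")
print(result)
```

Everything outside `scripted_model` belongs to the harness in this sketch: the tool registry, the transcript, and the step budget. Swapping the model changes the decisions, while the harness still determines what the agent can see and do.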
What goes into a harness?
Think of the harness as the agent’s work infrastructure. It is not one tool. It is the system that lets you move from conversation to a validated delivery.
- Instructions: files like AGENTS.md, rules, and skills that carry commands, standards, and project agreements.
- Context: specs, technical decisions, architecture, product docs, and enough history so the agent is not working in the dark.
- Tools: terminal, Git, browser, GitHub, logs, MCP servers, and other integrations that let the agent act in the right environment.
- Boundaries: sandbox, permissions, human approvals, and blocks for destructive or sensitive actions.
- Validation: lint, tests, typecheck, build, diff review, and checklists that push back against broken delivery.
- Observability: UI, logs, metrics, traces, screenshots, and reproducible checks that the agent (and the human) can read to understand what happened.
- Learning: each recurring failure becomes a rule, test, hook, doc, or skill. The harness improves because the team learned.
The goal is not to trap the agent. It is to create rails. Good rails do not prevent movement; they make movement predictable, reviewable, and safe.
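Two of the harness items above, boundaries and validation, lend themselves to a small sketch. The code below assumes a hypothetical approval policy (`NEEDS_APPROVAL`) and hypothetical helper names (`requires_approval`, `run_checks`). The check commands are placeholders that simply exit with a fixed status so the example runs anywhere; a real project would substitute its own lint, test, typecheck, and build commands.

```python
import subprocess
import sys

# Hypothetical policy: command prefixes that always need a human approval.
NEEDS_APPROVAL = ("git push", "rm ", "drop table")


def requires_approval(command: str) -> bool:
    """Boundary check: flag destructive or sensitive commands."""
    normalized = command.strip().lower()
    return any(normalized.startswith(prefix) for prefix in NEEDS_APPROVAL)


def run_checks(checks: dict[str, list[str]]) -> dict[str, bool]:
    """Validation gate: run each check command and record pass/fail."""
    results = {}
    for name, argv in checks.items():
        completed = subprocess.run(argv, capture_output=True, text=True)
        results[name] = completed.returncode == 0
    return results


# Placeholder checks: one that exits 0 and one that exits 1.
checks = {
    "passing_check": [sys.executable, "-c", "raise SystemExit(0)"],
    "failing_check": [sys.executable, "-c", "raise SystemExit(1)"],
}

results = run_checks(checks)
ready = all(results.values())
print(results, "ready for review:", ready)
print(requires_approval("git push origin main"))
print(requires_approval("git status"))
```

A prefix match is a deliberately crude boundary; sandboxing and real permission systems are stronger. The shape is what matters here: the agent proposes, the harness decides whether a human must approve, and a change counts as done only when every check passes.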