Professional flow

SDD and Harness Design: professional AI-native development

Advanced · 24 min read

In 30 seconds

SDD turns intent into a development plan. Harness Design gives agents the tools and validation gates to build safely at enterprise scale.

The shift that matters

The core problem in AI-assisted software work is not “which model should we use?” That changes constantly. The real problem is: how do we turn human intent into reliable software without losing control, quality, and responsibility along the way?

Vibe coding showed that models can generate a lot of code from natural language. That is powerful for prototypes. But in an enterprise environment, delivery cost is not just writing code. It is aligning scope, preserving architecture, protecting data, validating behavior, reviewing changes, auditing decisions, and keeping the system healthy after deployment.

Two ideas need to work together:

  • SDD, or Spec-Driven Development: writing and evolving specifications that act as a contract.
  • Harness Design: designing the work system around the agent so it has context, tools, boundaries, validation, and governance.

Separately, these ideas help. Together, they change the level of the workflow.

What SDD is

SDD means Spec-Driven Development.

In this context, a spec is a living, versioned, reviewable artifact that answers:

  • what problem we are solving;
  • who it matters to;
  • what is in and out of scope;
  • which behaviors must exist;
  • which technical, legal, or product constraints must be respected;
  • which criteria prove the work is correct;
  • which risks require human review.
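One lightweight way to make this concrete is to keep the spec as structured data alongside the prose, so tooling can check that a spec is complete before work starts. A minimal sketch in Python; the field names mirror the checklist above and are illustrative, not a standard:

```python
from dataclasses import dataclass, field


@dataclass
class Spec:
    """A living, versioned spec record; fields mirror the checklist above."""
    problem: str                       # what problem we are solving
    stakeholders: list[str]            # who it matters to
    in_scope: list[str]
    out_of_scope: list[str]
    behaviors: list[str]               # behaviors that must exist
    constraints: list[str]             # technical, legal, or product constraints
    acceptance_criteria: list[str]     # criteria that prove the work is correct
    human_review_risks: list[str] = field(default_factory=list)

    def is_reviewable(self) -> bool:
        # A spec without a problem statement or acceptance criteria
        # cannot act as a contract, so it cannot gate anything.
        return bool(self.problem and self.acceptance_criteria)
```

The point of the `is_reviewable` check is the contract idea: if the spec cannot say what "correct" means, the agent has nothing to implement against.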

The central shift is simple: the agent does not implement “the idea that was implicit in the chat.” It implements against a contract.

What Harness Design is

Harness Design is the design of the agent's operating environment.

If SDD answers “what must be true?”, the harness answers “how do we create a system where the agent can execute, validate, and improve without relying on informal memory?”.

A harness can include:

  • persistent instructions such as AGENTS.md, CLAUDE.md, or .cursor/rules/;
  • versioned architecture, product, and domain documentation;
  • spec, PRD, plan, and review templates;
  • tools available to the agent, such as terminal, GitHub, browser, logs, and MCP servers;
  • permission boundaries and human approval points;
  • validation commands, CI, lint, typecheck, tests, and security checks;
  • PR standards, ownership, rollback, and audit practices;
  • learning loops that turn failures into rules, tests, or docs.

The goal is not to trap the agent. It is to create rails. Rails do not prevent movement; they make movement predictable.
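To make "rails" concrete, here is a hedged sketch of a harness policy check in Python. The tool names and the approval list are invented for illustration; a real harness would load these from versioned configuration:

```python
# Hypothetical harness policy: which tools the agent may use freely,
# and which actions require a human approval point.
ALLOWED_TOOLS = {"terminal", "github", "browser", "logs"}
NEEDS_APPROVAL = {"deploy", "delete_branch", "rotate_secrets"}


def authorize(tool: str, action: str) -> str:
    """Return 'allow', 'ask_human', or 'deny' for an agent request."""
    if tool not in ALLOWED_TOOLS:
        return "deny"          # outside the rails entirely
    if action in NEEDS_APPROVAL:
        return "ask_human"     # a defined human approval point
    return "allow"             # predictable movement inside the rails
```

Notice the three outcomes: the harness does not only block or permit, it also routes risky actions to a human, which is what keeps the agent productive without making it unsupervised.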

Applied example: enterprise SSO

Imagine a company needs to add SSO login for enterprise customers.

Amateur flow

Someone asks the agent: “implement SSO with Google”. The agent creates an integration, adds dependencies, changes auth, passes the build, and the screen opens. It looks like progress.

But nobody answered:

  • which providers are supported;
  • how domains map to organizations;
  • what happens to existing users;
  • who can configure SSO;
  • how login is audited;
  • how to roll back if the provider fails;
  • which tests prove permissions do not leak.

That is not enterprise engineering. It is a prototype in a blazer.

SDD + Harness flow

PM writes the intent: enterprise customers need to authenticate users through the SSO provider configured for their organization.

Tech lead turns that into a spec: flows, states, permissions, boundaries, schemas, endpoints, and risks.

QA derives scenarios: user without an allowed domain, provider unavailable, expired invitation, organization without SSO, existing user with the same email.

Security defines constraints: tokens cannot go into unsafe storage; logs cannot contain secrets; callback must validate state.
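The "callback must validate state" constraint can be sketched in a few lines of Python. The helper names are hypothetical; in practice this lives inside your auth framework, with the state stored server-side in the session:

```python
import hmac
import secrets


def new_state() -> str:
    # Generated before redirecting to the provider and stored server-side;
    # never trusted from the client alone.
    return secrets.token_urlsafe(32)


def callback_is_valid(stored_state: str, returned_state: str) -> bool:
    # Constant-time comparison avoids leaking information via timing.
    return hmac.compare_digest(stored_state, returned_state)
```

A mismatched state means the callback did not originate from the login flow this server started, so the request is rejected.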

Agent receives the spec, auth-module context, project rules, and validation commands.

Harness runs the gates: typecheck, lint, unit tests, integration tests, build, diff review, and security checklist.
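The gate sequence above can be sketched as a small runner that stops at the first failure. The exact commands are project-specific assumptions, shown here for a hypothetical TypeScript project:

```python
import subprocess

# Hypothetical gate commands; replace with your project's real ones.
GATES = [
    ("typecheck", ["npx", "tsc", "--noEmit"]),
    ("lint",      ["npx", "eslint", "."]),
    ("tests",     ["npx", "vitest", "run"]),
]


def run_gates(gates=GATES, runner=subprocess.run):
    """Run each gate in order; return (all_passed, names_that_ran)."""
    ran = []
    for name, cmd in gates:
        ran.append(name)
        result = runner(cmd)
        if result.returncode != 0:
            return False, ran   # fail fast: fix the gate before moving on
    return True, ran
```

Failing fast matters here: a typecheck failure makes the later, slower gates meaningless, and the list of gates that ran tells the team exactly where the harness stopped the change.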

If something fails, the team does not just ask the agent to “try again”. It asks: was an acceptance criterion missing? A test? A rule? A tool? The answer becomes an improvement to the harness.

How to start without bureaucracy

You do not need to build a full agent lab before using this.

Start small:

  1. Create a simple spec template for medium-risk features.
  2. Write a short AGENTS.md with commands, architecture, and essential rules.
  3. Define the minimum bar before accepting a PR: diff read, lint, typecheck, tests, and acceptance criteria.
  4. Review the harness once per sprint or before major delivery.

The goal is not bureaucracy. The goal is reducing improvisation where improvisation is expensive.

You made it to the end

Now put it to work.

Go to the hands-on project
