Skip to main content

The Arbiter Method

A framework for building software when AI does the typing.

The Arbiter Method is a set of principles for directing AI agents to build software. It assumes you are the decision-maker and the AI is the implementer. Your job is to decide what gets built, how to verify it, and when to ship it.

Principle 1: Always Ask the AI First

Your default posture is delegation. Before you write code, before you research, before you draft: externalize the task to an agent. Your job is to decide, not to type.

This is leverage. Every minute you spend typing is a minute you're not spending evaluating, directing, or running a parallel workstream. The agent might not get it right on the first try. That's fine. Getting a draft in thirty seconds that you spend two minutes correcting is still faster than writing it yourself in five.

The hard part isn't delegation. The hard part is overcoming the instinct to "just do it yourself." That instinct was correct when typing was the bottleneck. It's not anymore.

Principle 2: Work in Extreme Parallelism

One agent is a bottleneck. Many agents are a system. Spin up multiple workspaces. Run competing approaches simultaneously. Compare outputs.

This is where werk comes in. Each agent gets its own isolated workspace: its own branch, its own filesystem, its own shell. Agent A can try approach A while Agent B tries approach B. You evaluate both. You pick the better one. You discard the other.

The cost of exploring a bad approach is near-zero when the workspace is disposable. The cost of not exploring an alternative is the approach you never saw.

Principle 3: Make the Smallest Change Possible

Every change is a potential break. Reduce blast radius. Break work into minimal deltas. Verify thoroughly. Ship frequently.

This is the discipline layer. Agents will happily produce sprawling changes that touch forty files. Your job is to constrain: "Change only the authentication module." "Don't modify the tests yet." "One function at a time."

Small changes are easy to verify. Easy to revert. Easy to understand. The hygiene standard: tests pass, lint clean, types check, build succeeds. Every single time. No exceptions.

The Loop

Delegate narrowly → Verify aggressively → Merge small → Repeat across parallel lanes.

That's the method. It's not complicated. The hard part is the discipline to follow it when you're tempted to let an agent "just refactor the whole thing." Resist. Constrain. Verify. Ship.