Custom AI Agents

Part 5 — Orchestration

When do you actually need multiple agents — and what does a well-structured system look like?

6 min · Updated June 2026

It is tempting to architect everything as a swarm of specialised agents collaborating. Resist it.

Q5.1 — What is the multi-agent reality check?

Anthropic’s own published analysis is blunt: multi-agent systems use roughly 15× more tokens than a single chat, so they only make economic sense when the task’s value is high enough to justify that cost and the work is genuinely parallelisable or too large for one context window.
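To make the economics concrete, here is a rough back-of-the-envelope sketch. Only the 15× multiplier comes from the text above; the token count and price are hypothetical placeholders, so substitute your own measurements.

```python
# Hypothetical figures for illustration only; only the 15x multiplier is from the text.
CHAT_TOKENS = 4_000               # assumed tokens for one single-chat run of the task
MULTI_AGENT_MULTIPLIER = 15       # rough multiplier cited above
PRICE_PER_MILLION_TOKENS = 5.00   # assumed blended price in dollars


def run_cost(tokens: int, price_per_million: float = PRICE_PER_MILLION_TOKENS) -> float:
    """Dollar cost of a run that consumes `tokens` tokens."""
    return tokens / 1_000_000 * price_per_million


single_chat_cost = run_cost(CHAT_TOKENS)
multi_agent_cost = run_cost(CHAT_TOKENS * MULTI_AGENT_MULTIPLIER)


def worth_going_multi_agent(extra_value_per_task: float) -> bool:
    """True only if the added value per task exceeds the added token spend."""
    return extra_value_per_task > (multi_agent_cost - single_chat_cost)


print(f"single chat: ${single_chat_cost:.2f}, multi-agent: ${multi_agent_cost:.2f}")
```

Under these assumed numbers the multi-agent run costs about fifteen times as much per task, so it only pays off when the extra value per task comfortably exceeds that difference.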

The decision tree:

1. Can a workflow (predefined steps) solve it? Do that. Cheapest, most reliable, most auditable.
2. If not, can a single agent with good tools and context management solve it? Do that.
3. Only if the task is genuinely parallel, exceeds a single context window, or spans many complex tool domains should you reach for multi-agent.

Most teams skip straight to step three. That is the mistake.
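The decision tree above can be sketched as a small function. This is an illustrative sketch, not a prescribed API: `TaskProfile`, its fields, and `choose_architecture` are hypothetical names, and the final fallback to a single agent is an assumption about how to read step three.

```python
from dataclasses import dataclass
from enum import Enum


class Architecture(Enum):
    WORKFLOW = "workflow"
    SINGLE_AGENT = "single agent"
    MULTI_AGENT = "multi-agent"


@dataclass
class TaskProfile:
    # Hypothetical yes/no judgements a team would make about a task.
    fixed_steps_suffice: bool
    single_agent_suffices: bool
    genuinely_parallel: bool = False
    exceeds_context_window: bool = False
    spans_many_tool_domains: bool = False


def choose_architecture(task: TaskProfile) -> Architecture:
    """Walk the three questions in order and stop at the first that fits."""
    if task.fixed_steps_suffice:            # step 1: a workflow solves it
        return Architecture.WORKFLOW
    if task.single_agent_suffices:          # step 2: one well-equipped agent solves it
        return Architecture.SINGLE_AGENT
    if (task.genuinely_parallel
            or task.exceeds_context_window
            or task.spans_many_tool_domains):
        return Architecture.MULTI_AGENT     # step 3: the only cases that justify it
    # Assumption: with no multi-agent criterion met, stay with the simpler option.
    return Architecture.SINGLE_AGENT


print(choose_architecture(TaskProfile(fixed_steps_suffice=True, single_agent_suffices=True)))
```

The ordering is the point: the cheaper options are checked first, so multi-agent is only reachable once both have been ruled out.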

Q5.2 — What are the five multi-agent patterns used in production?

When you do go multi-agent, the field has converged on five recurring shapes:

Supervisor (orchestrator-workers) is the 2026 default. One orchestrator agent owns the overall task and full context; it spins up ephemeral, isolated worker sub-agents for sub-tasks, each of which returns a compressed summary. This works because it combines a single point of coherent control with clean context isolation.
Pipeline (sequential) is staged refinement, where each agent’s output feeds the next, for example research → screen → schedule. Predictable and easy to reason about.
Fan-out (parallel) runs independent branches simultaneously and then merges them. The hard requirement is that the branches must be genuinely independent; if they need to coordinate mid-flight, this pattern breaks.
Debate has two agents argue opposing sides of a question while a third agent judges. Surprisingly effective for hard, subjective decisions, and cheap to wire up.
Swarm uses peer-to-peer agents with shared state and no fixed hierarchy. Powerful but hard to control. Reserve it for back-office work, almost never for a customer-facing journey.
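As a minimal sketch of the supervisor shape (with a fan-out step inside it), the example below uses plain Python functions in place of model calls. `run_worker` and `supervisor` are hypothetical names, and the worker is a stub: a real one would run its own model loop in an isolated context.

```python
from concurrent.futures import ThreadPoolExecutor


def run_worker(subtask: str) -> str:
    """Stand-in for an isolated worker sub-agent (a real one would call a model).

    It builds up its own private context and hands back only a short summary.
    """
    private_context = [f"notes gathered while working on {subtask!r}"] * 50
    return f"{subtask}: done ({len(private_context)} notes condensed)"


def supervisor(task: str, subtasks: list[str]) -> str:
    """Orchestrator: owns the overall task, fans out to workers, merges summaries."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # map() returns results in the same order as the input subtasks.
        summaries = list(pool.map(run_worker, subtasks))
    return f"{task}\n" + "\n".join(f"- {s}" for s in summaries)


report = supervisor("Vendor due diligence", ["financials", "security", "references"])
print(report)
```

Note that the supervisor never sees the workers' private notes, only the one-line summaries. That is the context isolation the pattern relies on, and it only holds here because the three subtasks do not need to talk to each other mid-flight.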

Q5.3 — What is the planner / generator / evaluator trio?

A particularly useful specialisation of the supervisor pattern for long-running tasks is to separate the agent that plans, the agent that does the work, and a separate agent that judges the result. Separating the doer from the judge measurably reduces the “graded its own homework” failure — where an agent confidently rates its own bad output as good — which matters enormously for subjective outputs like legal drafting or financial commentary.
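A minimal sketch of the trio, assuming each role is a separate function standing in for a separate agent. All names are hypothetical and the evaluator's acceptance rule is a toy; the point is only that the component producing the output is never the one that grades it.

```python
from typing import Optional


def planner(goal: str) -> list[str]:
    """Planner: break the goal into ordered steps (stub for a planning agent)."""
    return [f"outline {goal}", f"draft {goal}"]


def generator(step: str, feedback: Optional[str] = None) -> str:
    """Generator: do the work for one step, revising if feedback is supplied."""
    draft = f"result of '{step}'"
    return f"{draft} (revised)" if feedback else draft


def evaluator(output: str) -> tuple[bool, str]:
    """Evaluator: a separate judge. Toy rule: only revised drafts are accepted."""
    accepted = output.endswith("(revised)")
    return accepted, "" if accepted else "needs another pass"


def run(goal: str, max_rounds: int = 3) -> list[str]:
    results = []
    for step in planner(goal):
        feedback = None
        output = ""
        for _ in range(max_rounds):       # bounded loop so a harsh judge cannot stall us
            output = generator(step, feedback)
            accepted, feedback = evaluator(output)
            if accepted:
                break
        results.append(output)
    return results


print(run("memo"))
```

The `max_rounds` cap is a design choice worth keeping in a real system: without it, a generator and evaluator that never agree will loop indefinitely and burn tokens.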

Q5.4 — What within-agent design patterns are actually used?

Independent of how many agents you have, each agent’s internal behaviour draws on a small set of patterns from the canonical Anthropic taxonomy:

Prompt chaining: break a task into sequential steps.
Routing: classify the input, then dispatch to the right handler. This is hugely underused; many problems that get built as agents are really routing problems.
Parallelisation: split into independent subtasks, or run the same task several times and vote.
Evaluator-optimizer: generate, critique with a separate evaluator, refine in a loop. Essential for high-stakes drafting.
ReAct: the baseline reason → act → observe loop.
Reflection: add a self-critique step; raises accuracy at the cost of latency.

You compose these. A contract-review agent might route by contract type, chain through extraction → analysis → drafting, and run an evaluator-optimizer loop on the final language. None of this is exotic; it is deliberate composition of simple patterns.
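The contract-review composition described above might look like the sketch below. Every function is a hypothetical stub (keyword matching and string formatting where a real system would call a model), so read it as the shape of the composition rather than a working reviewer.

```python
def route(contract_text: str) -> str:
    """Routing: classify the input (keyword stand-in for a model classifier)."""
    return "nda" if "confidential" in contract_text.lower() else "services"


def extract(contract_text: str) -> dict:
    """Chain step 1: pull out clauses (naive sentence split)."""
    return {"clauses": [c.strip() for c in contract_text.split(".") if c.strip()]}


def analyse(extracted: dict, contract_type: str) -> dict:
    """Chain step 2: analyse the extracted clauses for this contract type."""
    return {"type": contract_type, "clause_count": len(extracted["clauses"])}


def draft(analysis: dict, feedback: str = "") -> str:
    """Chain step 3: write the review, revising when feedback is given."""
    text = f"Review of {analysis['type']} contract: {analysis['clause_count']} clauses."
    if feedback:
        text += " Recommendation: escalate to counsel."
    return text


def critique(text: str) -> str:
    """Evaluator: return feedback, or an empty string when the draft passes."""
    return "" if "Recommendation" in text else "add a recommendation"


def review_contract(contract_text: str, max_rounds: int = 3) -> str:
    contract_type = route(contract_text)                         # routing
    analysis = analyse(extract(contract_text), contract_type)    # prompt chaining
    output = draft(analysis)
    for _ in range(max_rounds):                                  # evaluator-optimizer loop
        feedback = critique(output)
        if not feedback:
            break
        output = draft(analysis, feedback)
    return output


print(review_contract("All information is confidential. Term is two years."))
```

Each pattern stays a plain function with one job, which is what makes the composed whole easy to test and audit step by step.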