Academy

How to design AI agent teams

What is an orchestrator? What do report-to, shared, and parallel mean? What does each parameter do? This guide is grounded in the latest Claude subagent architecture. Learn it here before using the editor.


Foundations

🧠

Claude architecture: orchestrator + subagents

A 'chief' (orchestrator) delegates to subagents; each subagent runs in its own isolated context window.

In Claude Code a skill is made of the orchestrator (chief) that runs the main conversation, plus the subagents it calls on demand.

The orchestrator is the main Claude talking to the user. It splits the task and delegates to the right subagent.
Each subagent is a separate Markdown file: .claude/agents/<name>.md. It has its own system prompt, its own tool access, and its own permissions.
The critical point: each subagent runs in its own context window. The search, log, and file noise it produces never floods the main conversation; the subagent returns only its summary. This isolation is what keeps the main context clean.

The orchestrator decides which subagent to call, and when, by reading the subagent's description. So the description isn't marketing — it's a trigger.

In this editor the master node is the orchestrator and a sub node is one subagent file.
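To make the isolation idea concrete, here is a minimal Python sketch of a chief delegating to one subagent. It is a toy model, not Claude Code's implementation: the Subagent and Orchestrator classes, their fields, and the fake "noise" entries are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Subagent:
    name: str
    description: str
    # Private working context: search hits, logs, file dumps, and so on.
    context: list[str] = field(default_factory=list)

    def run(self, task: str) -> str:
        # Noisy intermediate work stays inside this subagent's own context.
        self.context.append(f"searched files for: {task}")
        self.context.append(f"read log lines about: {task}")
        # Only a short summary leaves the subagent.
        return f"[{self.name}] summary for '{task}'"


@dataclass
class Orchestrator:
    subagents: dict[str, Subagent]
    context: list[str] = field(default_factory=list)

    def delegate(self, agent_name: str, task: str) -> str:
        summary = self.subagents[agent_name].run(task)
        # The main conversation records the summary, never the noise.
        self.context.append(summary)
        return summary


reviewer = Subagent("reviewer", "Use after code changes to review for quality.")
chief = Orchestrator({"reviewer": reviewer})
result = chief.delegate("reviewer", "login bug")
```

After the call, the chief's context holds one line (the summary) while the reviewer's private context holds the two noisy entries.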

🧭

Two axes: AGENT and WORKFLOW pattern

Nodes are either real agents (master/sub) or workflow containers that orchestrate children (sequential/parallel/loop).

Every node on the canvas falls into one of two groups:

1) Real agents — each maps to a file/persona:

master → orchestrator (chief)
sub → specialist subagent

2) Workflow containers — not files; they define how the agents placed inside them run:

sequential → in order (handing off output)
parallel → at the same time, isolated
loop → generate → evaluate → refine

Agents you drop inside a workflow container don't connect to each other — the container defines the pattern. Manual connections exist only at the outer level (chief → container → synthesis).
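One way to picture the two groups is as a small data model. The sketch below is an assumption about how such a canvas could be represented, not the editor's actual schema: the Node class and the validate function are hypothetical, and the rule that an agent cannot hold children is inferred from the text above.

```python
from dataclasses import dataclass, field

AGENT_KINDS = {"master", "sub"}                       # map to files/personas
CONTAINER_KINDS = {"sequential", "parallel", "loop"}  # define how children run


@dataclass
class Node:
    name: str
    kind: str
    children: list["Node"] = field(default_factory=list)


def validate(node: Node) -> list[str]:
    """Return a list of problems found in this node and its children."""
    errors = []
    if node.kind not in AGENT_KINDS | CONTAINER_KINDS:
        errors.append(f"{node.name}: unknown kind '{node.kind}'")
    # Assumption: only workflow containers may hold other nodes.
    if node.kind in AGENT_KINDS and node.children:
        errors.append(f"{node.name}: an agent cannot hold children")
    for child in node.children:
        errors.extend(validate(child))
    return errors


team = Node("research", "parallel", [Node("web", "sub"), Node("docs", "sub")])
bad = Node("writer", "sub", [Node("helper", "sub")])
```

Here validate(team) finds nothing wrong, while validate(bad) reports that the writer agent is being used as a container.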

Agent types

👑

Master (orchestrator / chief)

Splits the task into subtasks and delegates; gathers and synthesizes results.

Master is the brain that takes the user's request, plans it, and hands work to subagents. In Claude Code this is the main conversation.

Its jobs:

Clarify the request, split it into parts.
Delegate each part to the right subagent (by its description).
Merge the summaries returned by subagents and present them to the user.

A good master prompt says when to call which subagent and how to combine outputs. A skill typically has exactly one master.

🛠️

Sub (specialist subagent)

Specialist at one job; a .claude/agents file with its own prompt, tools, and model.

Sub is the worker agent specialized in a narrow area (e.g. code reviewer, researcher, test writer). Each is generated as .claude/agents/<name>.md.

Key facts:

It runs in its own context window → it doesn't pollute the main conversation.
It can only use the tools you grant (e.g. a read-only reviewer).
When done it returns a summary to the orchestrator; intermediate noise doesn't come back.
Subagents don't talk to each other directly. Coordination happens via the chief or via a shared file.

The most valuable field is description: that's how the chief knows when to delegate to this agent.
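As a rough illustration of what a generated subagent file might contain, the sketch below renders a Markdown file with frontmatter followed by the system prompt. The render_subagent helper is hypothetical, and the exact output of the editor may differ; the frontmatter keys shown (name, description, tools, model) are the commonly used ones, and values containing a colon would need YAML quoting that this sketch does not handle.

```python
def render_subagent(name, description, prompt, tools=None, model="inherit"):
    """Build the text of a .claude/agents/<name>.md style file (illustrative)."""
    lines = ["---", f"name: {name}", f"description: {description}"]
    if tools:
        # Omitting the tools line means the subagent inherits all tools.
        lines.append("tools: " + ", ".join(tools))
    lines.append(f"model: {model}")
    lines.append("---")
    lines.append("")
    lines.append(prompt)
    return "\n".join(lines) + "\n"


reviewer_file = render_subagent(
    name="code-reviewer",
    description="Use after code changes to review for security and quality.",
    prompt="You are a careful code reviewer. Report findings as a short list.",
    tools=["Read", "Grep", "Glob"],  # read-only reviewer
)
```

The resulting string starts with the frontmatter block and ends with the prompt, ready to be written to a file named after the agent.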

Workflow patterns

📝

Sequential (ordered / pipeline)

Steps run in order; each step's output is handed off to the next.

A sequential container runs its agents in a specific order, handing off each one's output to the next.

When: steps depend on each other — e.g. draft → edit → format. One step can't start before the previous finishes.

This is a pipeline. You don't wire the inner agents manually; the order defines the flow.
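The pipeline idea can be sketched in a few lines of Python. The run_sequential function and the three toy steps are hypothetical stand-ins for real agents; the point is only that each step receives the previous step's output.

```python
from functools import reduce
from typing import Callable

Step = Callable[[str], str]


def run_sequential(steps: list[Step], initial: str) -> str:
    # Feed each step's output into the next one, in list order.
    return reduce(lambda output, step: step(output), steps, initial)


def draft(topic: str) -> str:
    return f"draft about {topic}"


def edit(text: str) -> str:
    return text.replace("draft", "edited draft")


def fmt(text: str) -> str:
    return text.upper()


result = run_sequential([draft, edit, fmt], "agent teams")
# result == "EDITED DRAFT ABOUT AGENT TEAMS"
```

Reordering the list changes the flow, which mirrors how the order inside the container defines the pipeline.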

⚡

Parallel (concurrent)

Agents run at the same time, isolated; then converge at a synthesis point.

A parallel container launches its agents at the same time and isolated from each other. They don't talk; each does its own work and returns a result.

When: the jobs are independent — e.g. research 3 different sources at once. Parallelism cuts total time.

Results converge in a merge/synthesis agent (the chief or a separate sub). The flow is: chief → parallel container → synthesis.

Tip: If you don't know the count in advance, use dynamic fan-out (below), for example "one agent per book chapter".
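The chief → parallel container → synthesis flow can be approximated with a thread pool. This is a sketch under assumptions: run_parallel, research, and synthesize are hypothetical names, and real subagents are separate model sessions rather than Python threads.

```python
from concurrent.futures import ThreadPoolExecutor


def research(source: str) -> str:
    # Each worker sees only its own input; workers never talk to each other.
    return f"notes from {source}"


def synthesize(results: list[str]) -> str:
    # The synthesis point is the only place where results meet.
    return " | ".join(results)


def run_parallel(worker, items, merge):
    items = list(items)
    with ThreadPoolExecutor(max_workers=max(1, len(items))) as pool:
        # pool.map returns results in input order, even if workers finish out of order.
        results = list(pool.map(worker, items))
    return merge(results)


report = run_parallel(research, ["docs", "forum", "changelog"], synthesize)
```

Because the three jobs are independent, the total time is roughly that of the slowest worker rather than the sum of all three.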

🔁

Loop (generate–evaluate–refine)

Generate–critique–refine until a quality bar is met (or N iterations are reached).

A loop container repeatedly runs a generator and a critic agent: generate → evaluate → refine. It iterates until the output is good enough or an iteration limit is hit.

When: quality is critical and one shot isn't enough, e.g. write the piece, critique it, and fix it, repeating until the result is satisfactory.

Guard against infinite loops with maxTurns.
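A generate → evaluate → refine loop with a hard stop might look like the sketch below. All names are hypothetical, the length-based score is a toy critic, and max_iterations here only plays the role that an iteration limit such as maxTurns plays in a real agent.

```python
def run_loop(generate, evaluate, refine, threshold, max_iterations):
    """Return (draft, evaluations_used, passed)."""
    draft = generate()
    for iteration in range(1, max_iterations + 1):
        if evaluate(draft) >= threshold:
            return draft, iteration, True      # quality bar met
        if iteration < max_iterations:
            draft = refine(draft)              # critic feedback applied
    return draft, max_iterations, False        # limit hit, bar not met


def generate() -> str:
    return "short"


def evaluate(text: str) -> int:
    return len(text)  # toy score: longer counts as better


def refine(text: str) -> str:
    return text + " and more detail"


good = run_loop(generate, evaluate, refine, threshold=30, max_iterations=5)
capped = run_loop(generate, evaluate, refine, threshold=30, max_iterations=2)
```

With a limit of 5 the loop passes on the third evaluation; with a limit of 2 it stops early and reports that the bar was not met, which is the behavior you want instead of looping forever.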

Connections (communication)

➡️

Handoff

One step's output becomes the next step's input.

Handoff is the connection that feeds A's output into B as input. It's the core link of sequential pipelines and of the merge/synthesis step.

Example: researcher → writer (the research notes are handed to the writer).

↩️

Report-to

When a subagent finishes, it returns its result (summary) to the orchestrator.

Report-to is a subagent returning its result to the chief. It maps exactly to Claude's real behavior: the subagent works in its own context and returns only its summary to the orchestrator.

That's why there's no direct messaging between subagents; everything flows through the chief, which merges the incoming reports.

⚡

Parallel (concurrent launch)

The chief launches several agents at once, isolated.

A parallel connection means the chief launches multiple agents concurrently and isolated (chief → agents). Parallel agents don't talk to each other; they converge at a synthesis point.

This is the outer-level counterpart of the "parallel" workflow container.

🗃️

Shared (shared file / blackboard)

Agents coordinate via a shared file; each writes its section, others read it.

Since direct messaging isn't possible, when agents need to "communicate" you use a shared file (blackboard) — e.g. workspace/shared.md.

Each agent writes its section, others read it. This is the right model when the user says "agents should share/communicate with each other".

A related mechanism: the persistent memory directory (the memory field) given to subagents can also serve as a surface for coordination and accumulated notes.
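The blackboard pattern can be sketched with an ordinary file. The section format ("## agent" headings) and the helper names below are assumptions made for illustration; the editor or your own agents may structure the shared file differently. The sketch uses a temporary directory in place of a real workspace/ folder.

```python
import tempfile
from pathlib import Path


def write_section(board: Path, agent: str, text: str) -> None:
    # Each agent appends its own section; it never edits another agent's.
    with board.open("a", encoding="utf-8") as f:
        f.write(f"## {agent}\n{text}\n\n")


def read_sections(board: Path) -> dict[str, str]:
    # Any agent can read what the others have written so far.
    sections: dict[str, list[str]] = {}
    current = None
    for line in board.read_text(encoding="utf-8").splitlines():
        if line.startswith("## "):
            current = line[3:]
            sections[current] = []
        elif current is not None and line:
            sections[current].append(line)
    return {agent: "\n".join(lines) for agent, lines in sections.items()}


workspace = Path(tempfile.mkdtemp())  # stand-in for a real workspace directory
board = workspace / "shared.md"
write_section(board, "researcher", "Found 3 relevant sources.")
write_section(board, "writer", "Outline drafted from research notes.")
sections = read_sections(board)
```

The writer can read the researcher's section without any direct message passing between the two.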

∞

Dynamic fan-out (unbounded)

Instead of a fixed count you say 'one agent per input item'; the count is decided at runtime.

You can switch a parallel or loop container to Dynamic mode. Then you don't need to know how many children to place up front: you write a collection in the "fan out over" field (e.g. each chapter of the book, each file, each URL).

At runtime the orchestrator spawns one subagent per item in that list — the count is unbounded. The node shows an ∞ badge in the editor, and the exported SKILL.md tells the chief: "spawn one worker agent per item in the list."

When: cases where the count depends on the input, like "chief, create as many parallel agents as this book has chapters."
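Dynamic fan-out differs from a fixed parallel container only in that the number of workers comes from the input. A minimal sketch, assuming a hypothetical fan_out helper; the max_concurrency cap is an added practical assumption and is not something the editor is stated to enforce.

```python
from concurrent.futures import ThreadPoolExecutor


def fan_out(items, worker, max_concurrency=4):
    """Run one worker call per item; the item count is known only at runtime."""
    items = list(items)
    if not items:
        return []
    # Assumption: cap simultaneous workers; the number of tasks stays unbounded.
    with ThreadPoolExecutor(max_workers=min(max_concurrency, len(items))) as pool:
        return list(pool.map(worker, items))


def summarize_chapter(chapter: str) -> str:
    return f"summary of {chapter}"


book = [f"chapter {n}" for n in range(1, 8)]  # could be any length
summaries = fan_out(book, summarize_chapter)
```

Seven chapters produce seven summaries; a book with seventy chapters would need no change to the design.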

Parameters

🎯

description — the trigger (most important field)

Where you state WHEN the chief should call this subagent. A weak description = the agent never gets called.

Claude picks a subagent by reading its description. So write it like a trigger: "Use when…".

Good: "Use after code changes to review for security and quality." Weak: "A reviewer." (Unclear when to call it.)

If you want proactive triggering, say so explicitly: "…use proactively." This is written as description in the frontmatter and is required.
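A quick self-check for descriptions can be automated. The heuristics below are assumptions chosen for illustration (a "Use ..." opening and a minimum length); they are not how Claude selects a subagent, which happens by the model reading the description.

```python
def lint_description(description: str) -> list[str]:
    """Return warnings for a subagent description (heuristic, illustrative)."""
    text = description.strip()
    if not text:
        return ["description is required"]
    warnings = []
    if not text.lower().startswith("use "):
        warnings.append('does not open with a trigger phrase such as "Use when..."')
    if len(text.split()) < 5:
        warnings.append("too short to say when the agent should be called")
    return warnings


good = lint_description("Use after code changes to review for security and quality.")
weak = lint_description("A reviewer.")
```

The good example passes with no warnings, while "A reviewer." triggers both checks.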

🧩

model — which Claude model

haiku (fast/cheap) · sonnet (balanced) · opus (complex planning) · fable (most capable) · inherit (same as parent).

Each subagent can pick its own model. Valid values: sonnet, opus, haiku, fable, a full model ID (e.g. claude-opus-4-8), or inherit. The default is inherit (same model as the main conversation).

Practical choice:

haiku → fast scan, file search, cheap work.
sonnet → speed/intelligence balance; most analysis work.
opus → complex planning, long-horizon agentic work.
fable → the hardest reasoning (most capable, most expensive).
inherit → keep the model the same as the chief.

The easiest cost lever: move non-thinking subagents to haiku.

🎚️

effort — thinking/effort depth

low · medium · high · xhigh · max. Higher = deeper reasoning and more tokens; lower = fast and cheap.

effort tunes how deeply the model thinks and how many tokens it spends. Values: low, medium, high, xhigh, max. The default is usually high.

low → short, scoped, latency-sensitive work (and simple subagents).
medium → cost/quality balance.
high → recommended floor for intelligence-sensitive work.
xhigh → the sweet spot for coding and agentic work (the default in Claude Code).
max → the hardest tasks where correctness beats cost.

max is supported only on the high-end Opus-tier models, not on every model. Lower effort = fewer tool calls, terser output.

🧰

tools / disallowedTools — tool access

Limits which tools the subagent may use. If you don't list any, it inherits all tools.

The tools field is an allowlist of tools the subagent can use; leave it empty and it inherits all tools. disallowedTools removes tools from the inherited/granted list (a denylist).

Claude's real tools: Read, Grep, Glob, LS (read) · Write, Edit, MultiEdit, NotebookEdit (write) · Bash (system) · WebFetch, WebSearch (web) · Agent, Skill, TodoWrite (agent/skill).

Least privilege: e.g. a reviewer should be read-only → only Read, Grep, Glob.

If you grant the Agent tool, you can scope which subagents it may call → Agent(slug1, slug2). To preload a skill rather than listing Skill by hand, use the skills field.
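The interaction between the allowlist and the denylist can be expressed as simple set arithmetic. This sketch follows the rules stated above (empty tools inherits everything, disallowedTools subtracts); the effective_tools function is hypothetical, and ALL_TOOLS simply copies the tool names listed in this guide.

```python
ALL_TOOLS = frozenset({
    "Read", "Grep", "Glob", "LS",
    "Write", "Edit", "MultiEdit", "NotebookEdit",
    "Bash", "WebFetch", "WebSearch",
    "Agent", "Skill", "TodoWrite",
})


def effective_tools(tools=None, disallowed=None, inherited=ALL_TOOLS):
    # An empty or missing allowlist means "inherit every tool".
    granted = set(tools) if tools else set(inherited)
    # The denylist is applied last and always wins.
    return granted - set(disallowed or [])


read_only = effective_tools(tools=["Read", "Grep", "Glob"])
no_shell = effective_tools(disallowed=["Bash"])
```

A reviewer built with the first call cannot write or run commands; the second call keeps everything except Bash.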

🛡️

permissionMode — permission mode

How much approval the agent's actions require: default · acceptEdits · auto · dontAsk · bypassPermissions · plan.

permissionMode sets how much approval the subagent asks for when running tools. Values:

default → normal permission prompts.
acceptEdits → auto-accept file edits.
auto → proceed automatically where appropriate.
dontAsk → don't ask (use carefully).
bypassPermissions → skip permissions (riskiest — only in a trusted, isolated environment).
plan → plan first, make no changes (read-only exploration).

Choose the risky modes only when you know what you're doing.

💾

memory — persistent memory

Gives the subagent a persistent notes directory across sessions to accumulate learnings.

With memory on, the subagent can read/write a persistent memory directory across sessions (e.g. codebase patterns, recurring issues). Scope options: user, project, local.

When: specialist agents that work in the same project for a long time and should accumulate experience. It's also a loose coordination surface (see shared).

Security: never write secrets/passwords to memory.

⚙️

maxTurns · background · isolation · color

Turn limit, background execution, isolated git worktree, and a visual label color.

maxTurns → the maximum number of agentic turns before the subagent stops. Infinite-loop protection (especially for loop).
background → the agent always runs in the background and doesn't block the main flow.
isolation: worktree → the agent works in an isolated git worktree copy of the repo. Keeps risky/experimental changes out of the main workspace.
color → a visual label color to identify the agent in the /agents UI (red, blue, green…). No functional effect.

🔌

hooks · mcpServers · skills

Event-triggered commands, external tool servers, and preloaded expertise packs.

hooks → shell commands that run on specific events (e.g. PreToolUse, PostToolUse). E.g. run a formatter after every edit. A matcher (e.g. Edit|Write) selects which tools trigger it.
mcpServers → MCP (Model Context Protocol) servers the agent can reach; standard external capabilities (GitHub, Slack, etc.). Referenced by name; credentials are managed separately.
skills → expertise packs (SKILL.md files) to preload into the agent's context. The full content is injected; the agent can still call unlisted skills via the Skill tool.
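To show how an event plus a matcher could select a hook, here is a small sketch. It assumes the matcher is treated as a regular expression tested against the whole tool name; that matching rule, the HOOKS structure, and the "run-formatter" command are illustrative assumptions rather than the documented behavior.

```python
import re

# Hypothetical hook table: run a formatter after every Edit or Write.
HOOKS = [
    {"event": "PostToolUse", "matcher": "Edit|Write", "command": "run-formatter"},
]


def matching_commands(event: str, tool_name: str, hooks=HOOKS) -> list[str]:
    """Return the commands whose event and matcher both fit this tool call."""
    return [
        hook["command"]
        for hook in hooks
        if hook["event"] == event and re.fullmatch(hook["matcher"], tool_name)
    ]


after_edit = matching_commands("PostToolUse", "Edit")
after_read = matching_commands("PostToolUse", "Read")
before_edit = matching_commands("PreToolUse", "Edit")
```

Only the first call returns the formatter command; reads and pre-tool events are left alone.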

Deployment targets

🎯

Target LLM: Claude vs Codex/Gemini

Claude has native subagents (.claude/agents); Codex and Gemini are single-agent, so patterns become documented steps.

You can export a skill to different targets:

Claude → native subagent support. The editor generates .claude/agents/*.md files + SKILL.md + .claude/CLAUDE.md. All features (parallel, loop, hooks, MCP, memory…) are active.
Codex (AGENTS.md) and Gemini (GEMINI.md) → currently single-agent. Multi-agent patterns (parallel/loop) are converted to sequential steps; unsupported features like hooks/MCP are dropped.

The editor performs these conversions automatically and shows a warning for unsupported features. That's why this course is based on the Claude architecture — it's the richest.
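The conversion for single-agent targets can be pictured as a flattening step plus a list of warnings. This is a sketch under assumptions: the node dictionaries, the to_single_agent function, and the UNSUPPORTED_FEATURES set are hypothetical and do not reflect the editor's real export code.

```python
# Assumption: only the features the guide names as dropped are listed here.
UNSUPPORTED_FEATURES = {"hooks", "mcpServers"}


def to_single_agent(nodes, features):
    """Flatten containers into ordered steps and drop unsupported features."""
    steps, warnings = [], []
    for node in nodes:
        if node["kind"] in ("parallel", "loop"):
            warnings.append(
                f"{node['kind']} container '{node['name']}' converted to sequential steps"
            )
        steps.extend(node["agents"])
    kept = []
    for feature in features:
        if feature in UNSUPPORTED_FEATURES:
            warnings.append(f"feature '{feature}' dropped")
        else:
            kept.append(feature)
    return steps, kept, warnings


steps, kept, warnings = to_single_agent(
    [
        {"kind": "parallel", "name": "research", "agents": ["web", "docs"]},
        {"kind": "sequential", "name": "write", "agents": ["writer"]},
    ],
    ["hooks", "description"],
)
```

The parallel research container becomes two ordinary steps ahead of the writer, and each lossy change is surfaced as a warning instead of being applied silently.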

📦

SKILL.md vs subagent: two kinds of description

The subagent description is the chief's trigger; the skill description is human-readable copy for the marketplace card.

Don't conflate the two "descriptions":

Subagent description → a machine trigger: when should the chief call this agent. Short, condition-focused.
Skill description (SKILL.md / marketplace) → human-readable copy: what this skill does. Shown on the card.

If your deployment target is "subagent" you write the trigger description; for "skill/both" you write the marketplace description. The editor's Inspector shows these as separate fields.

This content is based on the latest Claude Code subagent architecture. If a feature changes, the guide is updated.