Chapter 18 – Sub-Agents and Multi-Agent Collaboration

This post is part of a series walking through key ideas from my book, Master Claude Chat, Cowork and Code. In the previous chapter we built the governance layer — permissions, approval workflows, hooks, and audit logs. Now we enter Part VII: Advanced Operational Patterns and the Future. Things get architecturally interesting from here.


Why One Agent Isn't Enough

As AI systems grow more complex, a single monolithic agent becomes a bottleneck. One agent trying to handle data analysis, code review, documentation, and infrastructure management simultaneously is like one employee trying to run every department in a company. It can technically attempt everything, but it excels at nothing.

Chapter 18 introduces multi-agent architecture — the practice of decomposing complex problems into specialized sub-problems, each handled by a purpose-built agent. A coordinator agent receives the user's request, identifies which specialists are needed, delegates sub-tasks, and synthesizes the results. It's the same pattern that makes organizations effective: specialized departments with deep expertise, coordinated by management.

The benefits are significant and the book walks through each one: specialization (each agent optimized for its domain with task-specific prompts, tools, and even model selection), scalability (add new specialists without modifying existing ones), fault tolerance (one agent's failure doesn't bring down the system), parallel execution (multiple agents working simultaneously), and explainability (you can trace exactly which agent did what).

Key idea: Multi-agent architecture mirrors how effective organizations work. You don't ask one person to be an expert at everything — you build teams of specialists and coordinate their work. The same principle applies to AI systems.

Spawning Specialized Sub-Agents

A sub-agent is a lightweight Claude instance created for a specific task. Unlike a persistent agent serving all requests, a sub-agent is born, executes its specialized task, and terminates. The book introduces a SubAgentFactory pattern that makes this clean and repeatable.

The chapter walks through three concrete specialist types: a data analysis agent (equipped with statistical tools and configured for trend identification and outlier detection), a code review agent (armed with static analysis and security scanning tools), and a documentation agent (optimized for technical writing with doc generation tools). Each gets its own system prompt, its own tool set, and its own token budget — all tailored to its specialization.
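
To make the "own prompt, own tools, own budget" idea concrete, here is a minimal sketch of what a factory along these lines could look like. It is not the book's SubAgentFactory: the class layout, the `SubAgentConfig` fields, and the tool names are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubAgentConfig:
    """Everything that distinguishes one specialist from another (hypothetical fields)."""
    name: str
    system_prompt: str
    tools: tuple[str, ...] = ()
    max_tokens: int = 1024


class SubAgentFactory:
    """Registry of specialist configurations, keyed by specialist name."""

    def __init__(self) -> None:
        self._configs: dict[str, SubAgentConfig] = {}

    def register(self, config: SubAgentConfig) -> None:
        self._configs[config.name] = config

    def create(self, name: str) -> SubAgentConfig:
        # Fail loudly on an unknown specialist instead of guessing a default.
        try:
            return self._configs[name]
        except KeyError:
            raise ValueError(f"unknown specialist: {name!r}") from None


factory = SubAgentFactory()
# Tool names below are placeholders, not real tool identifiers.
factory.register(SubAgentConfig(
    name="data_analysis",
    system_prompt="You are a data analyst. Identify trends and outliers.",
    tools=("run_statistics",),
    max_tokens=2048,
))
factory.register(SubAgentConfig(
    name="code_review",
    system_prompt="You are a code reviewer. Flag bugs and security issues.",
    tools=("static_analysis", "security_scan"),
))
factory.register(SubAgentConfig(
    name="documentation",
    system_prompt="You are a technical writer. Produce clear API docs.",
    tools=("generate_docs",),
))
```

Adding a fourth specialist is one more `register` call, which is the scalability property in miniature: nothing about the existing specialists has to change.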

What makes this section practical rather than theoretical is the execution model. The book shows the complete lifecycle: spawn the agent with its configuration, execute it against a specific task, capture the result along with metadata (tokens used, stop reason, timing), and clean up. Error handling is built in — if a sub-agent fails, the system captures the error without crashing the coordinator.
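
The lifecycle described above can be sketched roughly as follows. This is an illustration under assumptions, not the book's code: `call_model` is a hypothetical stand-in for the real model call, and `SubAgentResult` is an invented container for the metadata the paragraph lists.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SubAgentResult:
    agent: str
    output: Optional[str]
    tokens_used: int
    stop_reason: str
    seconds: float
    error: Optional[str] = None


# Hypothetical model call: (system_prompt, task) -> (text, tokens_used, stop_reason)
ModelCall = Callable[[str, str], tuple[str, int, str]]


def run_sub_agent(agent: str, system_prompt: str, task: str,
                  call_model: ModelCall) -> SubAgentResult:
    """Spawn, execute, capture metadata, and return; nothing outlives the call."""
    start = time.perf_counter()
    try:
        text, tokens, stop_reason = call_model(system_prompt, task)
        return SubAgentResult(agent, text, tokens, stop_reason,
                              time.perf_counter() - start)
    except Exception as exc:
        # Capture the failure as data so the coordinator can keep going.
        return SubAgentResult(agent, None, 0, "error",
                              time.perf_counter() - start,
                              error=f"{type(exc).__name__}: {exc}")


def fake_model(system_prompt: str, task: str) -> tuple[str, int, str]:
    """Stub used only to make the sketch runnable."""
    return f"done: {task}", 42, "end_turn"


def failing_model(system_prompt: str, task: str) -> tuple[str, int, str]:
    raise RuntimeError("boom")


ok = run_sub_agent("code_review", "You review code.", "check payment.py", fake_model)
failed = run_sub_agent("code_review", "You review code.", "check payment.py", failing_model)
```

Returning the error inside the result, rather than letting the exception propagate, is what lets a coordinator treat one failed specialist as a partial result instead of a crash.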


Task Delegation and Parallel Execution

Once you have specialized agents, the coordinator needs to decompose user requests and delegate intelligently. The book presents a coordinator agent pattern that handles this in stages: analyze the request to determine which specialists are needed, formulate sub-tasks for each specialist, identify dependencies between sub-tasks, and execute everything possible in parallel.
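
As a rough illustration of the first two stages, the sketch below turns a request into a list of sub-tasks. In a real coordinator the analysis step would most likely be a model call; simple keyword matching is swapped in here purely so the example runs on its own. The `SubTask` shape and the `KEYWORDS` table are assumptions, not the book's design.

```python
from dataclasses import dataclass, field


@dataclass
class SubTask:
    specialist: str
    instruction: str
    # Names of specialists whose output this sub-task needs first.
    depends_on: set[str] = field(default_factory=set)


# Hypothetical routing table: which words suggest which specialist.
KEYWORDS = {
    "data_analysis": ("analyze", "sales", "data"),
    "code_review": ("review", "code"),
    "documentation": ("documentation", "docs"),
}


def plan(request: str) -> list[SubTask]:
    """Pick the specialists a request seems to need and give each a sub-task."""
    lowered = request.lower()
    return [
        SubTask(specialist, f"Handle the {specialist} part of: {request}")
        for specialist, words in KEYWORDS.items()
        if any(word in lowered for word in words)
    ]


full = plan("Analyze our Q4 sales data, review the new payment "
            "processing code, and write API documentation.")
narrow = plan("Review the new payment processing code")
```

The useful part is the data shape rather than the matching logic: once each sub-task names its specialist and its dependencies, the later stages (ordering and parallel execution) have everything they need.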

The parallel execution model is where real performance gains happen. Consider a request like "Analyze our Q4 sales data, review the new payment processing code, and write API documentation." These three tasks are independent — they can run simultaneously. The coordinator spawns all three specialists at once, executes them in parallel, and only moves to synthesis after all results are in.
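
One way to express "spawn all three, wait for all three" in Python is `asyncio.gather`. The sketch below assumes an async coordinator; `run_specialist` is a hypothetical stand-in that sleeps instead of calling a model.

```python
import asyncio


async def run_specialist(name: str, task: str) -> str:
    """Stand-in for a real sub-agent call; sleeps to simulate latency."""
    await asyncio.sleep(0.01)
    return f"[{name}] finished: {task}"


async def run_in_parallel(assignments: dict[str, str]) -> dict[str, object]:
    """Start every specialist at once and wait until all have finished."""
    names = list(assignments)
    results = await asyncio.gather(
        *(run_specialist(name, assignments[name]) for name in names),
        # A failing specialist shows up as an exception object in the results
        # instead of cancelling the whole batch.
        return_exceptions=True,
    )
    return dict(zip(names, results))


assignments = {
    "data_analysis": "Analyze our Q4 sales data",
    "code_review": "Review the new payment processing code",
    "documentation": "Write API documentation",
}
results = asyncio.run(run_in_parallel(assignments))
```

Synthesis would start only after `run_in_parallel` returns, which matches the "all results are in" condition above.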

But not all tasks are independent. The book also covers dependency management — what happens when one sub-task depends on another's output. The coordinator maintains a dependency graph and executes tasks in waves: first the tasks with no dependencies, then the tasks that depend on the first wave's results, and so on. Circular dependency detection prevents deadlocks.
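
Wave-based scheduling can be sketched in a few lines. This is a generic layered topological sort, not the book's dependency resolution algorithm, and the task names are made up for the example.

```python
def plan_waves(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into waves; each task's dependencies sit in earlier waves."""
    remaining = {task: set(needs) for task, needs in deps.items()}
    done: set[str] = set()
    waves: list[list[str]] = []
    while remaining:
        # A task is ready once everything it depends on has completed.
        ready = sorted(t for t, needs in remaining.items() if needs <= done)
        if not ready:
            # Nothing can run: the leftovers form a cycle or depend on
            # a task that was never declared.
            raise ValueError(
                f"circular or unsatisfiable dependency among: {sorted(remaining)}"
            )
        waves.append(ready)
        done.update(ready)
        for task in ready:
            del remaining[task]
    return waves


deps = {
    "analyze_sales": set(),
    "review_code": set(),
    "write_docs": {"review_code"},
    "summary": {"analyze_sales", "write_docs"},
}
waves = plan_waves(deps)
# [['analyze_sales', 'review_code'], ['write_docs'], ['summary']]
```

Every task inside one wave can run in parallel. Python's standard library also ships `graphlib.TopologicalSorter`, which raises `CycleError` on circular dependencies and may be a better fit than a hand-rolled loop.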

Important: Parallel execution multiplies both capability and cost. Three agents running simultaneously can return results up to three times faster than running them one after another (the slowest agent sets the pace), but the workflow still consumes roughly three agents' worth of tokens. The book covers how to budget for multi-agent workflows and when parallel execution is worth the cost premium versus sequential processing.
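
A back-of-the-envelope estimate makes the trade-off easier to reason about. The numbers below are invented, and the sketch assumes each agent uses the same number of tokens however it is scheduled; the speedup reaches the full three times only when the three tasks take about equally long.

```python
from dataclasses import dataclass


@dataclass
class AgentEstimate:
    name: str
    seconds: float
    tokens: int


def summarize(estimates: list[AgentEstimate], token_budget: int) -> dict[str, object]:
    """Compare wall-clock time for both schedules and check the token budget."""
    sequential = sum(e.seconds for e in estimates)
    # In parallel, the slowest agent determines when the batch is finished.
    parallel = max(e.seconds for e in estimates)
    total_tokens = sum(e.tokens for e in estimates)
    return {
        "sequential_seconds": sequential,
        "parallel_seconds": parallel,
        "speedup": sequential / parallel,
        "total_tokens": total_tokens,
        "within_budget": total_tokens <= token_budget,
    }


# Hypothetical per-agent estimates, for illustration only.
report = summarize(
    [
        AgentEstimate("data_analysis", seconds=20.0, tokens=4000),
        AgentEstimate("code_review", seconds=30.0, tokens=6000),
        AgentEstimate("documentation", seconds=10.0, tokens=2000),
    ],
    token_budget=15000,
)
```

With these made-up figures the parallel run finishes in the time of the slowest agent, so the speedup is two times rather than three, while the token total is the sum across all three agents.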

Result Synthesis and Agent Teams

Getting results from multiple specialists is only half the challenge. The other half is synthesizing those results into a coherent, user-friendly output. The coordinator can't just concatenate three agents' responses — it needs to integrate insights, resolve conflicts, organize information logically, and generate a narrative that addresses the original request.

The book introduces a ResultSynthesizer pattern that takes the original user request alongside all sub-agent results and produces a unified response. This synthesis step is itself a Claude call — using the model's ability to integrate diverse information into coherent output.
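
The shape of such a synthesis step might look like the sketch below. The prompt wording is my own placeholder rather than the book's synthesis prompt, and `call_model` is a hypothetical callable standing in for the actual Claude call.

```python
from typing import Callable


def build_synthesis_prompt(user_request: str, results: dict[str, str]) -> str:
    """Put the original request and every specialist's output into one prompt."""
    sections = "\n\n".join(
        f"## {agent}\n{text}" for agent, text in results.items()
    )
    return (
        f"Original request:\n{user_request}\n\n"
        f"Specialist results:\n{sections}\n\n"
        "Integrate these results into one coherent response. "
        "Resolve any conflicts and organize the answer around the request."
    )


class ResultSynthesizer:
    """Illustrative only: wraps one model call around the assembled prompt."""

    def __init__(self, call_model: Callable[[str], str]) -> None:
        self._call_model = call_model

    def synthesize(self, user_request: str, results: dict[str, str]) -> str:
        return self._call_model(build_synthesis_prompt(user_request, results))


def echo_model(prompt: str) -> str:
    """Stub model: reports how many specialist sections it received."""
    return f"synthesis over {prompt.count('## ')} results"


synthesizer = ResultSynthesizer(echo_model)
answer = synthesizer.synthesize(
    "Analyze Q4 sales and review the payment code.",
    {"data_analysis": "Sales rose in December.",
     "code_review": "One unchecked error path."},
)
```

Including the original request in the prompt matters: without it the model can only summarize the specialists, not judge which of their findings actually answer what the user asked.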

The chapter then goes further with the concept of agent teams — structured groups where different agents play defined roles. The book presents a four-role team structure: an investigator (gathers information), an analyzer (identifies patterns), a strategist (develops solutions), and a communicator (synthesizes and presents results). Information flows through the team in stages, with each role building on the previous one's work.
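
The staged flow can be pictured as a simple pipeline in which each role sees the request plus everything produced before it. The role instructions and the `call_model` signature below are assumptions made for this sketch, and the stub model only counts what it was given.

```python
from typing import Callable

# The four roles in order; the instruction text is placeholder wording.
ROLES = [
    ("investigator", "Gather the relevant information."),
    ("analyzer", "Identify patterns in the findings."),
    ("strategist", "Develop solutions based on the analysis."),
    ("communicator", "Synthesize and present the results."),
]


def run_team(request: str,
             call_model: Callable[[str, str], str]) -> list[tuple[str, str]]:
    """Run the roles in sequence, feeding each one the accumulated context."""
    transcript: list[tuple[str, str]] = []
    context = request
    for role, instruction in ROLES:
        output = call_model(instruction, context)
        transcript.append((role, output))
        # Later roles build on everything earlier roles produced.
        context = f"{context}\n\n{role} output:\n{output}"
    return transcript


def stub_model(instruction: str, context: str) -> str:
    """Stub: reports how many earlier stages are visible in the context."""
    return f"saw {context.count(' output:')} earlier stage(s)"


transcript = run_team("Why did checkout conversions drop?", stub_model)
```

Unlike the independent specialists earlier, these stages are inherently sequential, so this structure trades parallel speed for depth of reasoning.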

Agent teams can also be hierarchical: a lead agent coordinates sub-agents, each of which might coordinate its own team of specialists. This hierarchical decomposition lets the system break large problems into progressively smaller pieces, a pattern the book notes mirrors how large engineering organizations tackle ambitious projects.
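
Structurally, a hierarchy like this is a tree in which leaves do the work and inner nodes combine what their reports return. The sketch below shows only that recursive shape; a real lead would split the task into different sub-tasks per report, whereas this toy version hands the same task to everyone. All names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    reports: list["Agent"] = field(default_factory=list)

    def handle(self, task: str) -> str:
        # A leaf agent does the work itself.
        if not self.reports:
            return f"{self.name}: {task}"
        # A lead delegates to each report and combines their answers.
        parts = [report.handle(task) for report in self.reports]
        return f"{self.name}(" + "; ".join(parts) + ")"

    def depth(self) -> int:
        """Number of levels in the hierarchy below and including this agent."""
        return 1 + max((r.depth() for r in self.reports), default=0)


org = Agent("lead", [
    Agent("backend_lead", [Agent("api"), Agent("db")]),
    Agent("docs"),
])
outcome = org.handle("ship v2")
# lead(backend_lead(api: ship v2; db: ship v2); docs: ship v2)
```

Because `handle` is the same method at every level, adding another layer of leads requires no new coordination code, only a deeper tree.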

Key idea: The progression from single agent to multi-agent to agent teams represents a shift from "AI as tool" to "AI as organization." Each level of sophistication unlocks new categories of problems you can solve.

What I'm Holding Back

I will not spoil the complete SubAgentFactory implementation, the full coordinator agent with its dependency resolution algorithm, the result synthesis prompts, or the hierarchical team decomposition patterns. The book includes working code for every architecture described here — the kind of implementation detail you need to actually build these systems rather than just understand them conceptually.

Ready to build multi-agent systems? Grab the book here for complete implementations of the sub-agent factory, coordinator pattern, parallel execution with dependency management, and agent team architectures that scale to problems of arbitrary complexity.

Next up — Chapter 19: Measuring AI Effectiveness. You've deployed AI agents — but how do you know if they're actually working? We'll explore the metrics framework that answers the questions every team eventually asks: is this saving time, is it accurate, and is it worth the cost?

2026-03-18

Sho Shimoda

I share and organize what I’ve learned and experienced.