# Multi-Agent Orchestration
Multi-agent systems often work in a demo but break in production. This skill is the playbook for building, debugging, and operating them at scale, with patterns that hold up when one agent is slow, one is hallucinating, and your budget is on fire.
## What it does
A complete reference for production multi-agent systems:
- **12 proven orchestration patterns** (supervisor, peer-to-peer, pipeline, blackboard, marketplace, etc.) with tradeoffs
- **8 anti-patterns** that look fine in a demo but fail in prod (with examples)
- **Reference implementation** — TypeScript & Python for the top 3 patterns
- **Debugging workflow** — "agents stuck in loop", "agent A won't pass to agent B", "context lost between turns"
- **Cost control** — token budgets per agent, circuit breakers, early exit
- **Failure recovery** — checkpointing, replay, human-in-the-loop fallback
- **Observability** — trace propagation, agent attribution, replay debugging
## When to use it
- You're building a multi-agent system and don't know which topology to use
- Your agents loop forever or run out of context
- Costs are unpredictable — one bad run costs $50
- You can't tell which agent is the bottleneck
- You need to add a human-in-the-loop step
- You're migrating from single-agent to multi-agent
## Why it's better than ad-hoc prompting
Most "how do I build multi-agent" prompts give toy examples. This skill is different:
- **Production-tested** — every pattern is from a real system
- **Quantified** — pattern selection has a decision tree, not vibes
- **Reference code** — copy-paste starting points in TS & Python
- **Anti-patterns named** — know what to avoid before you hit it
- **Debugging workflow** — actual steps to fix stuck agents
## Architecture
```
┌────────────────────────────────────────────────────────┐
│ Agent (Claude/Cursor)                                  │
│ - Reads user's multi-agent design                      │
│ - Recommends pattern via decision tree                 │
│ - Helps implement and debug                            │
└───────────────┬────────────────────────────────────────┘
                │
                ▼
┌────────────────────────────────────────────────────────┐
│ skills/multi-agent-orchestration/                      │
│   scripts/                                             │
│   ├── pattern_suggester.py  # Decision tree            │
│   ├── topology_diagram.py   # Mermaid generation       │
│   ├── cost_estimator.py     # Token budget per agent   │
│   ├── loop_detector.py      # Static analysis          │
│   └── trace_analyzer.py     # OTLP log parser          │
│   references/                                          │
│   ├── patterns.md           # 12 patterns detailed     │
│   ├── anti-patterns.md      # 8 to avoid               │
│   ├── decision-tree.md      # Which pattern when       │
│   └── debugging.md          # Stuck-loop fixes         │
│   examples/                                            │
│   ├── supervisor/           # TS + Python reference    │
│   ├── pipeline/                                        │
│   └── blackboard/                                      │
└────────────────────────────────────────────────────────┘
```
## Quick start
```bash
# 1. Install
pip install mermaid-py pyyaml tiktoken
# 2. Get pattern recommendation
python scripts/pattern_suggester.py --requirements requirements.yaml
# 3. Estimate cost
python scripts/cost_estimator.py --pattern supervisor --agents 3 --turns 10
# 4. Detect loops in your code
python scripts/loop_detector.py src/agents/
# 5. Analyze traces
python scripts/trace_analyzer.py traces.jsonl
```
## The 12 patterns (summary)
| # | Pattern | When to use | Cost | Latency |
|---|---------|-------------|------|---------|
| 1 | **Supervisor** | One clear owner, others are workers | Med | Med |
| 2 | **Peer-to-peer** | All agents equal, no central authority | High | High |
| 3 | **Pipeline** | Sequential processing, each step specialized | Low | Low |
| 4 | **Blackboard** | Multiple specialists share a common state | High | Med |
| 5 | **Marketplace** | Agents bid on tasks, best wins | High | High |
| 6 | **Hierarchical** | Tree of supervisors (org chart) | Med | Med |
| 7 | **Reflection** | Agent critiques its own output, iterates | High | High |
| 8 | **Tool-use** | One agent, many tools (not really multi-agent) | Low | Low |
| 9 | **Debate** | Two agents argue, judge picks winner | Very High | Very High |
| 10 | **Ensemble** | N agents solve same task, vote | Very High | High |
| 11 | **Router** | One router dispatches to N specialists | Low | Low |
| 12 | **Hybrid** | Mix of above, e.g. router → supervisor | Med | Med |
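
To make the table concrete, here is a minimal, framework-free sketch of the Router pattern (row 11). It shows why the pattern is cheap: one classification step, then exactly one specialist call. The agent functions, the `SPECIALISTS` table, and the keyword classifier are all hypothetical stand-ins, not part of the skill's scripts; a real router would usually classify with a small model call.

```python
from typing import Callable, Dict

# Hypothetical specialists: plain functions standing in for LLM-backed agents.
def billing_agent(task: str) -> str:
    return f"[billing] handled: {task}"

def search_agent(task: str) -> str:
    return f"[search] handled: {task}"

def general_agent(task: str) -> str:
    return f"[general] handled: {task}"

SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "billing": billing_agent,
    "search": search_agent,
}

def classify(task: str) -> str:
    """Toy keyword classifier (assumption); swap in a cheap model call in practice."""
    text = task.lower()
    if "invoice" in text or "refund" in text:
        return "billing"
    if "find" in text or "look up" in text:
        return "search"
    return "general"

def route(task: str) -> str:
    """Dispatch to exactly one specialist, falling back to a general agent."""
    handler = SPECIALISTS.get(classify(task), general_agent)
    return handler(task)

if __name__ == "__main__":
    print(route("Refund my invoice"))
    print(route("Find the onboarding docs"))
```

Because only one specialist runs per task, cost and latency stay close to a single-agent call plus the classification step.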
## The 8 anti-patterns (with fixes)
1. **Loop without exit** — agents call each other with no termination condition
→ Fix: explicit max turns + circuit breaker
2. **Context stuffing** — pass full conversation to every agent
→ Fix: per-agent context distillation, only relevant slice
3. **Same model, different prompts = "different agents"**
→ Fix: actually use different models or fine-tunes for different roles
4. **Synchronous calls in a loop** — each agent blocks the next
→ Fix: async/await or message queue
5. **No shared state** — agents re-derive context every turn
→ Fix: blackboard pattern with versioned state
6. **Shared mutable state** — agents corrupt each other's data
→ Fix: immutable state + copy-on-write, or actor model
7. **No cost controls** — one bad input → 10 agents × 10 turns = $200
→ Fix: per-agent token budget + global circuit breaker
8. **No human escape hatch** — agents loop forever
→ Fix: max turns + "ask human" tool + always-available abort
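
The fixes for anti-patterns 1, 7, and 8 share one mechanism: a guard that counts turns and tokens and stops the run when any limit is crossed. The sketch below is one possible shape for it, not the skill's own implementation; `RunGuard`, `BudgetExceeded`, the default limits, and the two stand-in agents are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict

class BudgetExceeded(RuntimeError):
    """Raised when a run must stop instead of spending more."""

@dataclass
class RunGuard:
    # All limits are illustrative defaults (assumptions), not recommendations.
    max_turns: int = 10
    per_agent_tokens: int = 20_000
    global_tokens: int = 100_000
    turns: int = 0
    spent: Dict[str, int] = field(default_factory=dict)

    def charge(self, agent: str, tokens: int) -> None:
        """Record one agent turn and trip the breaker if any limit is crossed."""
        self.turns += 1
        self.spent[agent] = self.spent.get(agent, 0) + tokens
        if self.turns > self.max_turns:
            raise BudgetExceeded(f"max turns ({self.max_turns}) exceeded")
        if self.spent[agent] > self.per_agent_tokens:
            raise BudgetExceeded(f"{agent} exceeded its token budget")
        if sum(self.spent.values()) > self.global_tokens:
            raise BudgetExceeded("global token budget exceeded")

def run_loop(guard: RunGuard, tokens_per_turn: int = 1_000) -> str:
    """Two stand-in agents hand off to each other until the guard stops them."""
    agent = "researcher"
    try:
        while True:  # deliberately has no exit of its own
            guard.charge(agent, tokens_per_turn)
            agent = "writer" if agent == "researcher" else "researcher"
    except BudgetExceeded as exc:
        # A real system would escalate to a human or return partial results here.
        return f"stopped: {exc}"

if __name__ == "__main__":
    print(run_loop(RunGuard(max_turns=4)))
```

Charging the guard before every agent call means a runaway loop fails fast with a named reason, which is also a natural place to hook in the "ask human" fallback.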
## Reference implementations
### Supervisor (TypeScript)
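```typescript
import { Annotation, StateGraph, START, END } from "@langchain/langgraph";

// Shared state: `messages` accumulates across nodes; `next` holds the routing decision.
const State = Annotation.Root({
  messages: Annotation<any[]>({
    reducer: (left, right) => left.concat(right),
    default: () => [],
  }),
  next: Annotation<string>(),
});

// supervisorNode, researcherNode, writerNode, and routeNext are defined elsewhere.
// routeNext reads the state and returns "researcher", "writer", or "FINISH".
const graph = new StateGraph(State)
  .addNode("supervisor", supervisorNode)
  .addNode("researcher", researcherNode)
  .addNode("writer", writerNode)
  .addEdge(START, "supervisor")
  .addConditionalEdges("supervisor", routeNext, {
    researcher: "researcher",
    writer: "writer",
    FINISH: END,
  })
  // Workers always hand control back to the supervisor.
  .addEdge("researcher", "supervisor")
  .addEdge("writer", "supervisor");

export const app = graph.compile();
```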
### Pipeline (Python)
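```python
from typing import TypedDict

from langgraph.graph import StateGraph, START, END


class State(TypedDict):
    raw_input: str
    cleaned: str
    analyzed: dict
    report: str


# clean_fn, analyze_fn, and report_fn are your own step functions, defined elsewhere.
# Each node returns only the keys it updates; LangGraph merges them into the state.
def clean(s: State) -> dict:
    return {"cleaned": clean_fn(s["raw_input"])}


def analyze(s: State) -> dict:
    return {"analyzed": analyze_fn(s["cleaned"])}


def write_report(s: State) -> dict:
    return {"report": report_fn(s["analyzed"])}


g = StateGraph(State)
g.add_node("clean", clean)
g.add_node("analyze", analyze)
# Node names must not collide with state keys, so this node is not named "report".
g.add_node("write_report", write_report)
g.add_edge(START, "clean")
g.add_edge("clean", "analyze")
g.add_edge("analyze", "write_report")
g.add_edge("write_report", END)
app = g.compile()
```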
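
The blackboard example is not reproduced in this listing. As a rough, framework-free illustration of the idea behind it (and of the "versioned state" and "immutable state + copy-on-write" fixes for anti-patterns 5 and 6), the sketch below keeps shared state as read-only snapshots and rejects writes based on a stale version. `Blackboard`, `Snapshot`, and `StaleWrite` are hypothetical names, and this is not the code shipped in `examples/blackboard/`.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Any, Mapping

@dataclass(frozen=True)
class Snapshot:
    """One immutable version of the shared state."""
    version: int
    data: Mapping[str, Any]

class StaleWrite(RuntimeError):
    """Raised when an agent writes against an outdated snapshot."""

class Blackboard:
    def __init__(self) -> None:
        self._current = Snapshot(0, MappingProxyType({}))

    def read(self) -> Snapshot:
        return self._current

    def write(self, based_on: int, updates: Mapping[str, Any]) -> Snapshot:
        """Copy-on-write: build a new snapshot instead of mutating the old one."""
        if based_on != self._current.version:
            raise StaleWrite(
                f"write based on v{based_on}, current is v{self._current.version}"
            )
        merged = {**self._current.data, **updates}
        self._current = Snapshot(based_on + 1, MappingProxyType(merged))
        return self._current

if __name__ == "__main__":
    board = Blackboard()
    seen = board.read()
    board.write(seen.version, {"facts": ["agents share one state"]})
    try:
        # A second agent still holding the old snapshot is told to re-read.
        board.write(seen.version, {"summary": "stale"})
    except StaleWrite as exc:
        print(f"rejected: {exc}")
```

An agent whose write is rejected re-reads the latest snapshot and retries, so no specialist silently overwrites another's contribution. The snapshot's top-level mapping is read-only; nested values such as lists would still need to be treated as immutable by convention.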
## Pricing
Single-purchase, lifetime access. $16.00.
Includes:
- 5 Python scripts (suggestion, cost, detection, tracing, diagrams)
- 4 reference docs (12 patterns, 8 anti-patterns, decision tree, debugging)
- 3 reference implementations (supervisor, pipeline, blackboard in TS + Python)
- Sample trace logs for debugging practice
- Future updates for the same major version
## Example usage
> "I'm building a research agent that uses 3 sub-agents. They keep looping. Help me fix it and pick the right topology."
The skill will:
1. Read your current code
2. Run the loop detector (likely finds 1-2 of the 8 anti-patterns)
3. Suggest the right topology (supervisor / pipeline / blackboard)
4. Estimate cost per run
5. Output a refactored design with the fix
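
For step 4, the arithmetic behind a per-run estimate can be sketched as follows. This is a back-of-the-envelope model, not the logic of `cost_estimator.py`: it assumes every agent is called once per turn, and the token counts and per-1K-token prices are placeholder values you would replace with your own measurements and your provider's current rates.

```python
def estimate_run_cost(
    agents: int,
    turns: int,
    input_tokens_per_call: int,
    output_tokens_per_call: int,
    usd_per_1k_input: float,
    usd_per_1k_output: float,
) -> float:
    """Upper-bound estimate: assumes every agent is called once per turn."""
    calls = agents * turns
    input_cost = calls * input_tokens_per_call / 1000 * usd_per_1k_input
    output_cost = calls * output_tokens_per_call / 1000 * usd_per_1k_output
    return input_cost + output_cost

if __name__ == "__main__":
    # Placeholder numbers (assumptions): 3 agents, 10 turns,
    # 4,000 input and 500 output tokens per call, $0.003 / $0.015 per 1K tokens.
    cost = estimate_run_cost(3, 10, 4_000, 500, 0.003, 0.015)
    print(f"estimated cost per run: ${cost:.2f}")
```

With these placeholder inputs the estimate comes to roughly $0.59 per run: 30 calls, 120K input tokens, and 15K output tokens. Input tokens usually dominate, which is why context stuffing (anti-pattern 2) drives cost so quickly.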
## Compatibility
Works with any agent that supports the SKILL.md standard and can execute Python: Claude Code, OpenClaw, Codex CLI, Cursor, Gemini CLI, Cline, Windsurf, Aider. Reference code uses LangGraph but patterns apply to any framework (AutoGen, CrewAI, raw function calling). Tested on Linux, macOS, Windows.
## Tags
agents, llm, orchestration, langgraph, autogen, crewai, multi-agent, ai-architecture, agent-ops