    Enterprise Multi-Agent Automation — Production Harness with Denbun, Retry & Self-Healing

    Battle-tested orchestration framework for running 3+ Claude Code agents in parallel. Covers task routing, denbun handoff protocol, exponential-backoff retry, rate-limit guards, structured JSON logging, and automated self-healing — patterns from real production deployments.

    Updated Jun 2026
    0 installs

    Free

    Included in download

    • Downloadable skill package
    • Works with Claude Code, tmux, Python, Bash
    • 1 permission declared
    • Instant install

    Sample input

    Initialize a new multi-agent task to refactor our OAuth2 implementation, ensuring the state is persisted for handoff.

    Sample output

    [denbun] Created denbun_feat_oauth_refactor.md
    [router] Task assigned to agent3 (Claude Code)
    [harness] Initialized state in queue/tasks/agent3.yaml
    [logger] Event: task_start | ID: feat_oauth_refactor | Model: sonnet-3.5
    Waiting for agent3 notification via tmux...

    About This Skill

    What it does

    This skill provides an orchestration framework for managing multiple AI agents working in parallel on a single codebase. It implements the "Denbun Protocol," a handoff system that writes agent state to external documents so that the state survives restarts and context compaction. It manages the full lifecycle of an agentic pipeline: task routing, error handling, and automated recovery.
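As a rough illustration of externalizing state, the sketch below writes a handoff document to disk and reads it back, as an agent might after a restart. The file layout, the field names (`task_id`, `agent`, `status`), and the helpers `write_denbun` and `read_denbun` are assumptions made for this example; the skill's actual Denbun format is not shown in this listing. Only the `denbun_<task_id>.md` file name follows the sample output above.

```python
import json
import os
import tempfile
from pathlib import Path


def write_denbun(directory, task_id, agent, status, next_steps):
    """Write a handoff document atomically and return its path.

    Hypothetical layout: a JSON header in an HTML comment on the first
    line, followed by a Markdown checklist of open steps.
    """
    path = Path(directory) / f"denbun_{task_id}.md"
    header = json.dumps({"task_id": task_id, "agent": agent, "status": status})
    body = "\n".join(f"- [ ] {step}" for step in next_steps)
    text = f"<!-- {header} -->\n# Handoff: {task_id}\n\n{body}\n"
    # Write to a temp file, then rename, so a crash mid-write never
    # leaves a half-written handoff behind.
    fd, tmp_name = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
    os.replace(tmp_name, path)
    return path


def read_denbun(path):
    """Recover the state header and the open steps from a handoff document."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    # str.removeprefix / removesuffix need Python 3.9+.
    header = json.loads(lines[0].removeprefix("<!-- ").removesuffix(" -->"))
    marker = "- [ ] "
    steps = [line[len(marker):] for line in lines if line.startswith(marker)]
    return header, steps
```

A restarted agent would call `read_denbun` first and continue from the returned steps instead of relying on its own context window.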

    Why use this skill

    Building multi-agent systems is easy in a demo but hard in production. This skill addresses common failure modes such as rate limiting, agent stalls, and messy handoffs. Instead of prompting agents by hand, you get a repeatable harness with a "self-healing" layer that detects when an agent is stuck and resumes it automatically, which reduces the amount of manual monitoring required.
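The stall check behind such a self-healing layer might look like the sketch below. The 15-minute staleness threshold and the 10-minute cooldown are the figures quoted in this listing's module list; the function name `should_restart`, the use of Unix timestamps, and the decision to return a boolean rather than restart anything are assumptions for illustration.

```python
import time

STALE_AFTER_S = 15 * 60  # a report older than this counts as stale
COOLDOWN_S = 10 * 60     # minimum gap between two restarts of one agent


def should_restart(last_report_ts, last_restart_ts, now=None):
    """Return True when an agent looks stuck and is not in cooldown.

    last_report_ts: Unix time of the agent's latest report.
    last_restart_ts: Unix time of its latest restart, or None if never restarted.
    now: current Unix time; defaults to time.time() (injectable for tests).
    """
    now = time.time() if now is None else now
    is_stale = (now - last_report_ts) > STALE_AFTER_S
    in_cooldown = (
        last_restart_ts is not None and (now - last_restart_ts) < COOLDOWN_S
    )
    return is_stale and not in_cooldown
```

The cooldown matters because a freshly restarted agent has not reported yet; without it, the healer would restart the same agent on every check.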

    Supported tools

    • Claude Code: Primary orchestration and reasoning engine.
    • Local LLMs (Gemma/Codex): For cost-efficient subtask routing.
    • Tmux: For persistent session management and event-driven communication.
    • Python/Bash: For the core logic of retries, guards, and logging.

    Expected Output

    The framework produces machine-readable YAML task files, versioned Markdown handoff documents (Denbun), and structured JSON log files that track token usage, latencies, and success rates across your fleet.
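For the structured logs, a minimal sketch could emit one JSON object per line. The four fields (timestamp, agent ID, event, metadata) are the ones named in this listing's module list; the key spellings, the one-object-per-line layout, and the helper name `log_event` are assumptions.

```python
import io
import json
from datetime import datetime, timezone


def log_event(stream, agent_id, event, **metadata):
    """Append one JSON log line to `stream` and return the record."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent_id": agent_id,
        "event": event,
        "metadata": metadata,  # free-form: token counts, latency, task ID, ...
    }
    # sort_keys keeps the key order stable, which makes diffs and grep easier.
    stream.write(json.dumps(record, sort_keys=True) + "\n")
    return record


# Usage: any writable text stream works, e.g. sys.stdout or an open log file.
buffer = io.StringIO()
log_event(buffer, "agent3", "task_start", task_id="feat_oauth_refactor")
```

Because each line is a complete JSON document, the log can be tailed, filtered, and aggregated without a custom parser.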

    Use Cases

    • Coordinate 3+ agents working simultaneously on the same repository.
    • Implement persistent agent state to survive context compaction and crashes.
    • Automate task routing between high-reasoning and low-cost models.
    • Enforce rate-limit guards and exponential back-off for API stability.
    • Generate structured audit trails for agent costs and performance metrics.
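The exponential back-off use case can be sketched as a decorator. This listing describes the skill's version as using full jitter and stopping immediately on 403/auth errors; everything else here is an assumption for illustration: the `ApiError` class, the choice to treat 401 as an auth error too, the default delays, and the injectable `sleep` parameter.

```python
import functools
import random
import time


class ApiError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status, message=""):
        super().__init__(message or f"HTTP {status}")
        self.status = status


NON_RETRYABLE = {401, 403}  # assumption: both are treated as auth failures


def retry(max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Retry on ApiError with exponential back-off and full jitter."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except ApiError as exc:
                    last_attempt = attempt == max_attempts - 1
                    # Auth errors will not fix themselves, so fail fast.
                    if exc.status in NON_RETRYABLE or last_attempt:
                        raise
                    # Full jitter: wait a uniform random time in [0, cap],
                    # where cap doubles per attempt up to max_delay.
                    cap = min(max_delay, base_delay * 2 ** attempt)
                    sleep(random.uniform(0, cap))

        return wrapper

    return decorator
```

Full jitter spreads retries from parallel agents across the whole back-off window, so agents that failed together do not all retry at the same instant.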

    Security Scanned

    Passed automated security review

    Permissions

    Terminal / Shell

    # Enterprise Multi-Agent Automation

    ## What You Get

    A complete, production-ready harness for orchestrating parallel Claude Code agents. Every pattern extracted from real deployments.

    ## Core Modules

    **1. Denbun Protocol**: External handoff documents surviving agent restarts and context compaction.

    **2. Task Router**: Shell function checking backend health; dispatches to the right agent with graceful fallback.

    **3. Retry with Exponential Back-off**: Python decorator with full jitter; stops on 403/auth errors immediately.

    **4. Rate-Limit Guard**: Async token-bucket limiter preventing burst API failures.

    **5. Structured JSON Logging**: Machine-readable audit trail: timestamp, agent ID, event, metadata.

    **6. Self-Healer**: Detects stale agent reports (>15 min) and restarts with 10-min cooldown.

    **7. Communication Protocol**: YAML queue files + tmux send-keys; event-driven, zero-polling.

    ## Use Cases

    - 3+ Claude Code agents working in parallel on the same codebase
    - Multi-agent review pipelines (security + quality + test agents in parallel)
    - Automated long-running agent loops with reliable handoffs across restarts
    - Products built on top of Claude Code requiring a repeatable harness

    ## Requirements

    Claude Code (claude CLI), tmux, Python 3.9+, Bash

    ## What's Included

    Complete SKILL.md with copy-paste-ready code patterns for all 7 modules, each annotated with the failure modes it solves.
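The rate-limit guard is described above as an async token-bucket limiter. A minimal sketch of that technique follows; the class name `TokenBucket`, its parameters, and the one-token-per-call accounting are assumptions rather than the skill's actual code. The bucket is created inside the running coroutine so that the `asyncio.Lock` binds to the right event loop on Python 3.9.

```python
import asyncio
import time


class TokenBucket:
    """Allow `rate` acquisitions per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start full so an initial burst passes
        self.updated = time.monotonic()
        self._lock = asyncio.Lock()

    def _refill(self):
        # Credit tokens for the time elapsed since the last refill.
        now = time.monotonic()
        earned = (now - self.updated) * self.rate
        self.tokens = min(self.capacity, self.tokens + earned)
        self.updated = now

    async def acquire(self):
        # Holding the lock while waiting serializes callers, so waiting
        # tasks are served in arrival order.
        async with self._lock:
            self._refill()
            while self.tokens < 1:
                await asyncio.sleep((1 - self.tokens) / self.rate)
                self._refill()
            self.tokens -= 1


async def demo():
    bucket = TokenBucket(rate=50, capacity=2)
    start = time.monotonic()
    for _ in range(4):
        await bucket.acquire()  # first two pass at once, the rest wait
    return time.monotonic() - start


elapsed = asyncio.run(demo())
```

With a capacity of 2 and a rate of 50 per second, the third and fourth calls each wait for a new token, so a burst from several agents is smoothed out rather than rejected by the API.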
