    Agent Failure Recovery Protocol

    by PubsProToolkit

    A structured recovery framework to stop agent loops, handle malformed output, and manage autonomous error escalation.

    Updated Jun 2026
    Security scanned
    Codex CLI

    $18 or 90 credits

    30-day refund guarantee

    Included in download

    • Prevent runaway token spend from infinite agent loops
    • Decompose complex tasks when sub-agents fail to deliver results
    • file_read automation included
    • Ready for Codex CLI
    • Instant install

    Sample input

    The research agent has provided three identical empty responses in a row. Apply the recovery protocol to fix this task loop.

    Sample output

    Failure Signal: Loop/Stalled (3x empty responses)
    Recovery Action: Decompose (Breaking 'Market Research' into 'Identify Competitors' and 'Feature Comparison')
    Circuit Breaker: 2/3 attempts used.
    Ledger Entry: Task failed due to broad scope; decomposed into smaller steps to bypass stall.
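The four labelled fields in the sample output could be modelled as a small record on the orchestrator side. The sketch below is illustrative only and is not part of the skill, which ships as instructions and runs no code. The `RecoveryReport` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RecoveryReport:
    """Hypothetical container mirroring the four fields of the sample output."""
    failure_signal: str
    recovery_action: str
    attempts_used: int
    attempts_allowed: int
    ledger_entry: str

    def render(self) -> str:
        # One labelled field per line, in the order the sample output uses.
        return "\n".join([
            f"Failure Signal: {self.failure_signal}",
            f"Recovery Action: {self.recovery_action}",
            f"Circuit Breaker: {self.attempts_used}/{self.attempts_allowed} attempts used.",
            f"Ledger Entry: {self.ledger_entry}",
        ])


report = RecoveryReport(
    failure_signal="Loop/Stalled (3x empty responses)",
    recovery_action=(
        "Decompose (Breaking 'Market Research' into "
        "'Identify Competitors' and 'Feature Comparison')"
    ),
    attempts_used=2,
    attempts_allowed=3,
    ledger_entry=(
        "Task failed due to broad scope; "
        "decomposed into smaller steps to bypass stall."
    ),
)
print(report.render())
```

Keeping the attempt counts as integers rather than a preformatted string would let an orchestrator compare them against its own retry limit.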

    About This Skill

    Self-Healing Agent Workflows

    The Agent Failure Recovery Protocol is a logic layer for multi-agent systems and long-running autonomous tasks. It addresses the "zombie agent" problem, in which agents loop, stall, or drift off-task, by giving the agent a structured decision framework for error handling. Rather than relying on simple try/catch blocks, it classifies each failure, chooses the cheapest recovery step likely to work, and caps retries so that the token budget is preserved and the project keeps moving.

    What it does

    • Failure Classification: Identifies whether an agent is stalled, looping, or producing malformed output.
    • Graded Recovery Ladder: Implements a tiered response system that starts with cheap reframing before escalating to task decomposition or human intervention.
    • Circuit Breakers: Hard limits that prevent runaway loops and protect your API budget from infinite retries.
    • Audit Trails: Maintains a failure ledger to track recurring bottlenecks and improve workflow architecture over time.
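The four behaviours listed above could be combined in an orchestrator roughly as follows. This is a minimal sketch under stated assumptions, not the skill's actual contents: the names `Failure`, `LADDER`, `classify`, and `RecoverySupervisor` are hypothetical, as are the three-response window and the default limit of three attempts (chosen to match the "2/3 attempts" shown in the sample output).

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional


class Failure(Enum):
    STALLED = "stalled"      # a run of empty responses
    LOOPING = "looping"      # the same non-empty response repeated
    MALFORMED = "malformed"  # output that breaks the expected contract


# Assumed ladder: cheapest step first, human intervention last.
LADDER = ["reframe", "decompose", "escalate_to_human"]


def classify(
    responses: List[str],
    is_valid: Callable[[str], bool] = lambda r: True,
    window: int = 3,
) -> Optional[Failure]:
    """Label the most recent responses, or return None if they look healthy."""
    recent = responses[-window:]
    if len(recent) == window:
        if all(not r.strip() for r in recent):
            return Failure.STALLED
        if len(set(recent)) == 1:
            return Failure.LOOPING
    if responses and not is_valid(responses[-1]):
        return Failure.MALFORMED
    return None


@dataclass
class RecoverySupervisor:
    max_attempts: int = 3  # circuit-breaker limit (assumed value)
    attempts: int = 0
    ledger: List[str] = field(default_factory=list)

    def next_action(self, failure: Failure) -> str:
        """Pick the next rung of the ladder, or halt once the breaker trips."""
        if self.attempts >= self.max_attempts:
            self.ledger.append(f"{failure.value}: halted, circuit breaker open")
            return "halt"
        action = LADDER[min(self.attempts, len(LADDER) - 1)]
        self.attempts += 1
        self.ledger.append(
            f"{failure.value}: {action} ({self.attempts}/{self.max_attempts})"
        )
        return action


supervisor = RecoverySupervisor()
failure = classify(["", "", ""])  # three empty responses in a row
actions = [supervisor.next_action(failure) for _ in range(4)]
print(actions)
print(supervisor.ledger)
```

In this sketch the breaker counts recovery attempts rather than raw agent calls, so a fourth failure on the same task halts instead of retrying. A real orchestrator might also reset the counter once a task succeeds.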

    Why use this skill?

    Standard prompting leaves agents brittle: when they hit an error, they often repeat it until they reach a context limit. This skill adds a "supervisory" layer to the agent's reasoning so that it detects a failure and changes approach instead of retrying blindly. It is tool-agnostic and works with any agent that reads the SKILL.md standard, including Claude Code, Codex CLI, Cursor, VS Code Copilot, and Gemini CLI. The output is a clear, auditable log of why a failure happened and how it was resolved.

    Use Cases

    • Prevent runaway token spend from infinite agent loops
    • Decompose complex tasks when sub-agents fail to deliver results
    • Maintain a professional audit log of all autonomous system failures
    • Recover gracefully from malformed or off-contract agent output
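For the last use case, "off-contract" output has to be detected before any recovery step can run. One way an orchestrator might check this is sketched below, assuming the agent is expected to return a JSON object. The `REQUIRED_KEYS` contract and the `contract_violation` helper are hypothetical and do not come from the skill itself.

```python
import json
from typing import Optional

# Hypothetical contract: keys every agent reply is expected to carry.
REQUIRED_KEYS = {"task_id", "status", "result"}


def contract_violation(raw: str) -> Optional[str]:
    """Return a short reason if the reply is off-contract, else None."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        return f"not valid JSON: {exc.msg}"
    if not isinstance(payload, dict):
        return "top-level value is not an object"
    missing = REQUIRED_KEYS - payload.keys()
    if missing:
        return "missing keys: " + ", ".join(sorted(missing))
    return None


print(contract_violation('{"task_id": 1}'))
```

Returning a reason string rather than a boolean gives the failure ledger something specific to record.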

    Security Scanned

    Passed automated security review

    Permissions

    Read Files

    This skill is a single SKILL.md instruction file. The agent only needs to read the file to load the protocol. It runs no code and requires no terminal, network, write, or environment access.

    Works with any agent that reads the SKILL.md standard (Claude Code, Codex CLI, Cursor, VS Code Copilot, Gemini CLI, and more). The skill is pure instructions: it needs no runtime, dependencies, or setup beyond loading the file. It is best paired with an orchestrator or long-running agent loop.

    Creator

    PubsProToolkit builds AI agent skills that bring regulated-industry rigor to written output. The skills are created by a CMPP-certified medical writer with a PhD and 10+ years in pharma, and they cover clinical and scientific publishing as well as evidence-grounded QC for any agent.
