Prompt Injection Red-Team Kit — Find and Fix Injection and Tool-Misuse Holes in Your Own Agent

Red-team your own AI agent for prompt-injection and tool-misuse vulnerabilities before it ships — then fix them. Maps your attack surface, generates a defensive test plan with the safe behavior expected for each case, and gives a prioritized mitigations list. Defensive use only.

Updated Jul 2026

Map the attack surface of RAG pipelines and tool-enabled agents
Generate defensive test cases with expected-safe behavior benchmarks
Identify tools requiring human-in-the-loop authorization gates

including Claude Code

Cursor

Codex CLI

Gemini CLI

Security scannedInstant install

$14

· or 70 credits

30-day refund guarantee

Secure checkout via Stripe

Included in download

Map the attack surface of RAG pipelines and tool-enabled agents
Generate defensive test cases with expected-safe behavior benchmarks
file_read, file_write automation included
Ready for including Claude Code

PubsProToolkit

PubsProToolkit builds rigor-first skills for AI agents — they write your docs and content properly, then adversarially r…

See it in action

You say

Red-team my own agent for injection and tool-misuse holes. It's an internal Slack HR bot I built: the system prompt answers HR questions using RAG over our policy PDFs, and it can book meetings via a Calendar tool and look up records via an HR-API tool. Untrusted text comes from user messages and the retrieved documents. Map the attack surface, generate a defensive test plan with the safe behavior expected for each case, and give me a prioritized mitigations list.

Your agent does

Defensive red-team plan for your Slack HR bot (works from your description; no attacks executed, no access to your agent)

Attack Surface (by blast radius)

Retrieved policy PDFs — indirect injection; goal: exfiltrate records or trigger tool calls; blast radius HIGH (feeds the model as "data" but is attacker-influencable).
User messages — direct injection; goal: unauthorized tool use; blast radius MODERATE.
HR-API tool — over-broad read scope; blast radius HIGH. Calendar tool — unwanted writes; blast radius MODERATE.

Defensive Test Plan (each case states the safe behavior; fill in the result)

Prioritized Mitigations

Trust boundary: label retrieved content as untrusted data; never follow instructions found in it.
Least privilege + authorization gates: scope HR-API to the requesting user; require human approval for Calendar writes.
Output handling: don't render or execute instructions echoed from documents or tool results.
Containment: cap what a single request can read/write; log tool calls.
Regression: keep these cases as a suite and re-run on prompt or tool changes.

Passing these reduces risk; it does not prove invulnerability.

Prompt Injection Red-Team Kit — Find and Fix Injection and Tool-Misuse Holes in Your Own Agent

Name: Prompt Injection Red-Team Kit — Find and Fix Injection and Tool-Misuse Holes in Your Own Agent
Price: 14 USD
Availability: InStock
Author: Agensi

by PubsProToolkit

Updated Jul 2026

Security scanned

including Claude Code

$14

· or 70 credits

30-day refund guarantee

Secure checkout via Stripe

⚡ Also available via Agensi MCP - your AI agent can load this skill on demand via MCP. Learn more →

Included in download

Map the attack surface of RAG pipelines and tool-enabled agents
Generate defensive test cases with expected-safe behavior benchmarks
file_read, file_write automation included
Ready for including Claude Code
Instant install

See it in action

You say

Your agent does

Defensive red-team plan for your Slack HR bot (works from your description; no attacks executed, no access to your agent)

Attack Surface (by blast radius)

Retrieved policy PDFs — indirect injection; goal: exfiltrate records or trigger tool calls; blast radius HIGH (feeds the model as "data" but is attacker-influencable).
User messages — direct injection; goal: unauthorized tool use; blast radius MODERATE.
HR-API tool — over-broad read scope; blast radius HIGH. Calendar tool — unwanted writes; blast radius MODERATE.

Defensive Test Plan (each case states the safe behavior; fill in the result)

Prioritized Mitigations

Trust boundary: label retrieved content as untrusted data; never follow instructions found in it.
Least privilege + authorization gates: scope HR-API to the requesting user; require human approval for Calendar writes.
Output handling: don't render or execute instructions echoed from documents or tool results.
Containment: cap what a single request can read/write; log tool calls.
Regression: keep these cases as a suite and re-run on prompt or tool changes.

Passing these reduces risk; it does not prove invulnerability.

Security scanned

About This Skill

As agents gain tools, memory, and the ability to read untrusted content — user input, retrieved documents, web pages, tool outputs — the model's instructions and its data blur together, and an attacker who controls any of that text can try to redirect the agent to leak secrets, misuse a tool, or subvert its task. Prompt Injection Red-Team Kit is a defensive red-team for hardening an agent you own. Describe your agent — its system prompt, the tools it can call and what each can do, and where untrusted text enters — and it maps the attack surface by blast radius, generates a tailored test plan of defensive cases (each stating the safe behavior a well-defended agent should show, plus a pass/fail result field), and writes a prioritized mitigations list: trust boundaries, least-privilege tools and authorization gates, output handling, guardrails, containment, and regression testing. The download includes four reference files: an attack-surface guide, a defensive test-case template, a mitigations guide, and a complete worked sample plan. It is for hardening your own systems, not attacking others'; it describes attack patterns conceptually rather than shipping ready-to-fire payloads. It works from your description, does not execute attacks or access your agent, and passing the tests reduces risk rather than proving invulnerability. Works with Claude Code, Cursor, Codex CLI, Gemini CLI, and any SKILL.md agent.

Use Cases

Map the attack surface of RAG pipelines and tool-enabled agents
Generate defensive test cases with expected-safe behavior benchmarks
Identify tools requiring human-in-the-loop authorization gates
Create security regression suites for agentic software deployments

Known Limitations

This is a defensive planning aid for hardening an agent you own — not an offensive or exploit tool, and not for attacking systems you don't control. It works entirely from the description you provide: it does not execute attacks, connect to your agent, or scan a live system, so results are only as complete as your description of the system prompt, tools, and untrusted-input sources. It describes attack patterns conceptually rather than shipping ready-to-fire payloads. Passing the generated tests reduces risk but does not prove invulnerability or guarantee security, and it cannot anticipate every novel injection technique. It produces a test plan and mitigations for you to implement and run; it is not a real-time firewall, runtime guardrail, or production monitoring service.

How to install

Drop the file into your AI tool. Works with Claude, Cursor, ChatGPT, and 20+ more.

Reviews

No reviews yet - be the first to share your experience.

Only users who have downloaded or purchased this skill can leave a review.

Early access skill

Security scanned

Built by PubsProToolkit

Works with any agent that follows the SKILL.md standard, …

Be the first to review this skill.

Only users who have downloaded or purchased this skill can leave a review.

Security Scanned

Passed automated security review

Permissions

Read Files

Write Files

File Scopes

references/**

This skill only reads the agent details you describe and writes Markdown deliverables (attack-surface map, defensive test plan, mitigations list) plus the four reference files under references/**. It performs no network access: it does not execute attacks, connect to your agent or any external system, or fetch anything. The auto-detected host pubsprotoolkit.com was removed because the skill makes no external connections.

Creator

PubsProToolkit

PubsProToolkit builds rigor-first skills for AI agents — they write your docs and content properly, then adversarially review them to catch what's wrong before it ships. The result: cleaner output and a hard quality gate in one toolkit. Built by a CMPP-certified, PhD medical writer who brings regulated-industry standards to developer docs, content, compliance, and research integrity.

Frequently Asked Questions

Learn More About AI Agent Skills

More Premium Skills

inline-comment

Best way to steer your agents, effortlessly.

$9.994 installs

designing-hybrid-context-layers

Architects the right retrieval strategy for every query — teaching your agent when to use RAG, a knowledge graph, or a temporal index instead of defaulting to vector search for everything.

$1016 installs

Cinematic Landing Page Builder

Turn any business URL into a high-end animated landing page with 4K AI assets and GSAP animations via Cloudflare.

$1913 installs

Bounty Security Pattern Master Library — 399 Vulnerability Patterns

A premium library of 399 vulnerability patterns and DeFi attack vectors for AI-driven bug hunting and security audits.

$7512 installs