AI Agent Production Hardening Kit
Transform fragile AI prototypes into resilient, enterprise-ready production agents with professional hardening tools.
Ship agent workflows in 30 seconds. Browse 1,500+ expert-built and security-scanned skills.
THE AGENSI STORE
314 skills found
Adversarial memory audit to remove PII, stale facts, and injected instructions from agent storage.
by Nex AI
Safe, read-only diagnostic tool for troubleshooting self-hosted stacks, port conflicts, and service failures.
Turns Claude into a senior WordPress launch reviewer that audits a site, theme, or plugin against the entire pre-launch standard across 7 weighted domains and returns one objective go/no-go decision with a scored blocker list.
An adversarial reviewer for Dockerfiles and container builds. It flags root users, image bloat, unpinned or cache-busting layers, leaked secrets, and missing hardening, then returns a PASS / FIX / BLOCK verdict — before you build or push the image.
Audit your project's dependencies for supply-chain risk before they ship. Detects the ecosystem, runs the right vulnerability scanners against live advisory data, and adds the checks tooling misses — outdated or abandoned packages, typosquatted or suspicious names, risky install scripts, and license conflicts — then returns a prioritized fix list and a PASS / REVIEW / BLOCK verdict. It's npm audit with triage and judgment on top.
A senior WordPress theme architect skill that migrates classic PHP themes to FSE block themes — extracting business logic into a companion plugin during conversion — and delivers both a phased reversible plan and the actual converted files.
Stop your agent citing papers that don't exist. Verifies every reference against live PubMed & Crossref — flags fabricated, mismatched, and retracted citations.
Review a database schema, queries, or migration for the mistakes that get expensive in production — bad table design, missing or wrong indexes, slow and N+1 queries, SQL injection, and migrations that lock or break prod. Engine-aware (PostgreSQL, MySQL, SQLite, SQL Server), it runs an ordered review and returns a PASS/REVIEW/BLOCK verdict with prioritized fixes. Schema mistakes are the most expensive kind — this catches them before they ship.
Generate a real test suite for any function, module, or file — meaningful edge cases, error paths, boundary conditions, and proper mocks, not happy-path stubs. Detects your project's framework and conventions, plans the cases deliberately before writing, and hands back runnable tests plus a summary of what's covered. Built to write the tests that actually catch bugs.
Scaffold a secure, spec-compliant MCP server from a description of the tools you want to expose. Sets up the official SDK (TypeScript or Python/FastMCP), defines tools/resources/prompts with strict JSON Schema, wires the right transport (stdio or Streamable HTTP), adds OAuth 2.1 for remote servers, and hardens against the MCP-specific footguns (prompt injection via tool output, token passthrough, over-broad scopes, command/path/SSRF injection, leaked secrets) before it ships. Returns a runnable skeleton plus a security checklist. Built by someone who's shipped production MCP servers.
Audit AI-assisted medical and pharma content for compliance-readiness before it enters formal MLR review or journal submission. It checks claim substantiation and on-label scope, reference integrity (the acute AI risk: fabricated or misrepresented citations), fair balance and safety, AI-use disclosure, ICMJE authorship and GPP, COI and funding, data integrity and patient privacy, and adverse-event flags — then returns a PASS / REVISE / BLOCK verdict with the must-fix list. A readiness pre-check built for the regulated reality of medical communications — not a replacement for formal review.
by Corey Jacobs
Run a buyer-readiness check before publishing an AI agent skill package.
A DevSecOps engineer that stands up and tunes static analysis (Semgrep, SonarQube, CodeQL) for high-signal findings — picks the right tool for the stack, writes the config and rulesets, wires a sane CI gate, and tunes out the false positives that get scanners muted.
by Ifásola
Specialized static security scanner for MCP servers and Python tool handlers to prevent injection and data leaks.
A professional technical editor's review for your docs. Catches missing context, unclear writing, and unverifiable claims in READMEs, API docs, and changelogs before they ship — with a PASS/REVISE verdict and a prioritized fix list.
Write and review the docs AI agents actually read — AGENTS.md for your repo and llms.txt for your site. Drafts them from scratch or audits existing ones for completeness, clarity, and wasted context, with a PASS or REVISE verdict.
Rewrite dense legal, medical, technical, policy, or financial text into clear plain language at a target reading level — with the meaning fully preserved. A professional plain-language rewrite, not a summary or a list of flags.
Generate consistent API reference docs from your code, OpenAPI spec, or route handlers — per-endpoint parameters, real request and response examples, error codes, auth, and copy-pasteable curl, written for the developer calling the API.
Generate a complete, reader-ready README from your code and project details — not a template dump. It leads with what the project is and why, gives a quickstart that actually runs, and includes only the sections that apply.
Turn your diffs and commit history into commit messages, PR descriptions, and release notes that reviewers and users actually read. One skill, three jobs — conventional-commit compliant, reviewer-ready, and written in plain language.
Design and write the eval suite for your LLM-powered feature — the metrics that match your failure modes, a golden dataset plan with starter cases, anchored rubrics, LLM-as-judge prompts with the known bias mitigations, and pass/fail gates wired for CI.
Adversarially audit your agent hooks before you trust them. Catches command injection, secret leakage, over-broad matchers, destructive actions, and blocking-logic mistakes in pre/post-tool-use, prompt, and stop hooks — with a PASS or REVISE verdict and severity-ranked fixes.
Forensic diagnostic tool that audits prompts and AI products for commercial failure and structural weaknesses.