Runs an ordered evidence-integrity gate over any AI draft — grade sources, ground claims, verify technical assertions, stress-test — then returns one PASS/REVISE/FAIL ship decision.
evidence-integrity, hallucination, fact-check, +4
Coding Agent Quality Gate — Catch AI-Written Security Bugs & Logic Errors Before Deploy
An adversarial reviewer for AI-written code changes. It pressure-tests a pull request or diff for untested branches, silent behavior changes, missing edge cases, over-confident code that only looks right, and weak tests, then returns a PASS / REVISE / BLOCK verdict before the change merges.
Audit your frontend for accessibility violations before release — flags WCAG failures, gives prioritized fixes, and blocks the broken patterns that get sites sued.
Production prompts grow by accretion — every failure gets another appended rule until the prompt is two thousand words of contradictions that the model navigates unpredictably.
Turns Claude into a senior WordPress launch reviewer that audits a site, theme, or plugin against the entire pre-launch standard across 7 weighted domains and returns one objective go/no-go decision with a scored blocker list.
An adversarial reviewer for Dockerfiles and container builds. It flags root users, image bloat, unpinned or cache-busting layers, leaked secrets, and missing hardening, then returns a PASS / FIX / BLOCK verdict — before you build or push the image.
docker, containers, devops, +2
Dependency & Supply Chain Risk Gate — Catch Vulnerable, Outdated & Typosquatted Packages Before They Ship
Audit your project's dependencies for supply-chain risk before they ship. Detects the ecosystem, runs the right vulnerability scanners against live advisory data, and adds the checks tooling misses — outdated or abandoned packages, typosquatted or suspicious names, risky install scripts, and license conflicts — then returns a prioritized fix list and a PASS / REVIEW / BLOCK verdict. It's npm audit with triage and judgment on top.
Review a database schema, queries, or migration for the mistakes that get expensive in production — bad table design, missing or wrong indexes, slow and N+1 queries, SQL injection, and migrations that lock or break prod. Engine-aware (PostgreSQL, MySQL, SQLite, SQL Server), it runs an ordered review and returns a PASS/REVIEW/BLOCK verdict with prioritized fixes. Schema mistakes are the most expensive kind — this catches them before they ship.
sql, database, postgresql, +7
Medical & Pharma AI Compliance Gate — Pass MLR, Evidence, COI & AI Use Checks Before Your Content Ships
Audit AI-assisted medical and pharma content for compliance-readiness before it enters formal MLR review or journal submission. It checks claim substantiation and on-label scope, reference integrity (the acute AI risk: fabricated or misrepresented citations), fair balance and safety, AI-use disclosure, ICMJE authorship and GPP, COI and funding, data integrity and patient privacy, and adverse-event flags — then returns a PASS / REVISE / BLOCK verdict with the must-fix list. A readiness pre-check built for the regulated reality of medical communications — not a replacement for formal review.
A DevSecOps engineer that stands up and tunes static analysis (Semgrep, SonarQube, CodeQL) for high-signal findings — picks the right tool for the stack, writes the config and rulesets, wires a sane CI gate, and tunes out the false positives that get scanners muted.
devsecops, security-scanning, ci-cd, +3
Docs Review Gate — A Professional Editor's Review of Your README, API Docs & Changelogs
A professional technical editor's review for your docs. Catches missing context, unclear writing, and unverifiable claims in READMEs, API docs, and changelogs before they ship — with a PASS/REVISE verdict and a prioritized fix list.
technical-writing, documentation, readme, +6
AI Feature Eval Writer — Golden Datasets, Rubrics, and LLM-as-Judge Prompts That Actually Catch Regressions
Design and write the eval suite for your LLM-powered feature — the metrics that match your failure modes, a golden dataset plan with starter cases, anchored rubrics, LLM-as-judge prompts with the known bias mitigations, and pass/fail gates wired for CI.
evals, llm-evaluation, llm-as-judge, +6
Agent Hooks Security and Quality Gate — Audit Your Pre- and Post-Tool-Use Hooks Before They Ship
Adversarially audit your agent hooks before you trust them. Catches command injection, secret leakage, over-broad matchers, destructive actions, and blocking-logic mistakes in pre/post-tool-use, prompt, and stop hooks — with a PASS or REVISE verdict and severity-ranked fixes.
agent-hooks, security, claude-code, +6
Background Agent Task Brief Writer — Delegate to Unattended Agents Without the Surprise PR
Write the delegation brief that lets a background or async agent succeed unattended — precise goal, hard constraints, testable acceptance criteria, a verification plan, and stop-and-escalate rules. Turns "go fix the flaky tests" into a spec an agent can actually execute.
delegation, background-agents, async-agents, +6
Fleet-Scale Migration Orchestrator (Human-Supervised)
Review an AI-generated code diff for the failure modes coding agents actually have — claimed-done-but-not-done, gamed or weakened tests, stubs passed off as complete, silent scope creep, hallucinated APIs, and security regressions. Returns an APPROVE or REQUEST CHANGES verdict with a completion check and severity-ranked fixes.