agent-regression-guard
by Rian O'Leary
Automated risk classification and regression checking to stop AI agents from breaking your codebase.
THE AGENSI STORE
10 skills found
by Rian O'Leary
Automated risk classification and regression checking to stop AI agents from breaking your codebase.
Generate meaningful, maintainable tests that actually protect your code — not just inflate coverage numbers.
by Shandra
Audits AI agent failures and converts recurring mistakes into durable rules, anti-patterns, regression tests, memory candidates, and improved SKILL.md sections.
Run real Playwright E2E tests on your web app: login, checkout, and form flows across desktop and mobile viewports, with screenshots, traces, and console logs captured on every failure. Catches broken flows and UI regressions before release, and tells you the likely fix, not just that something broke.
An adversarial gate that audits an AI eval or test suite — LLM-judge rubrics, datasets, regression tests, metrics — for gameable criteria, data leakage, missing edge cases, and non-determinism, then returns one PASS/REVISE/FAIL verdict.
Generate runnable accessibility regression tests, not just a findings report. Detects a11y issues (missing alt text, unlabeled controls, keyboard and focus gaps) in your routes, components, or HTML, then emits Playwright + axe-core spec files with targeted assertions and a remediation ticket for each issue. Previews the tests first and writes them only on your confirmation.
Stop your agent from claiming "done" before it's proven. A verification gate that classifies each change by risk (payment, auth, database, user-facing), picks the tests that actually cover it, demands evidence, maps regression risk, and outputs an honest pass/fail report. Turns "looks good to me" into "here's what I ran, and here's what's still unverified."
Audit your frontend build against a performance budget and catch size regressions before you ship. Flags total and initial bundle sizes that exceed the budget, individual chunks over a threshold, oversized image assets, source maps shipped to production, and large unminified JavaScript. Reads a webpack or Vite-style stats.json plus a perf-budget.json you control.
by Ikerg
Structural and cell-level diffing for CSV/Excel with schema drift detection and CI-ready exit codes.
by 王晓菲
A systematic 4-phase debugging framework to find root causes, eliminate flaky tests, and prevent regressions.