Test Flake Hunter
Detect and analyze flaky tests across multiple frameworks with automated repeated execution and severity reporting.
- Locate intermittent failures that pass locally but fail in CI pipelines.
- Quantify test reliability using flake ratios and severity classifications.
- Generate structured JSON reports of test stability for engineering dashboards.
$7 or 35 credits. Secure checkout via Stripe.
Included in download
- Locate intermittent failures that pass locally but fail in CI pipelines.
- Quantify test reliability using flake ratios and severity classifications.
- Automation via the terminal and file_read tools included
Sample input
I've noticed some intermittent failures in our authentication tests. Can you run the tests in tests/auth/ 20 times and tell me which ones are flaky?
Sample output
Found 1 flaky test in tests/auth/:
- test_session_expiry: 14 passes, 6 failures (30% flake ratio). Severity: High. Suggested cause: Race condition in session cleanup timing.
Detailed reports saved to flake-report.md and flake-report.json.
About This Skill
What it does
Test Flake Hunter is a diagnostic tool designed to identify non-deterministic test failures. It automatically detects your testing framework (pytest, Jest, or go test) and runs your suite or specific test files multiple times to uncover flaky behavior. By comparing pass/fail patterns across runs, it calculates a flake ratio for each test and produces structured reports on test reliability.
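The comparison of pass/fail patterns across runs can be illustrated with a minimal sketch. It is not the skill's actual implementation: the function name `flake_stats` and the input shape (one dict of test name to pass/fail per run) are assumptions made for illustration. The flake ratio is taken as failures divided by total runs, which matches the sample output above (6 failures in 20 runs is 30%).

```python
from collections import Counter


def flake_stats(runs):
    """Summarize per-test outcomes across repeated runs.

    `runs` is a list of dicts mapping test name -> True (passed) or
    False (failed). This input shape is an assumption for illustration.
    """
    passes, failures = Counter(), Counter()
    for run in runs:
        for name, passed in run.items():
            if passed:
                passes[name] += 1
            else:
                failures[name] += 1

    stats = {}
    for name in sorted(set(passes) | set(failures)):
        total = passes[name] + failures[name]
        stats[name] = {
            "passes": passes[name],
            "failures": failures[name],
            # Flake ratio assumed to be failures / total runs.
            "flake_ratio": failures[name] / total,
            # A test is flaky only if it both passed and failed.
            "flaky": passes[name] > 0 and failures[name] > 0,
        }
    return stats
```

A test that fails on every run gets a ratio of 1.0 but is not marked flaky here, since a consistent failure is deterministic rather than intermittent.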
Why use this skill
Manually re-running tests to catch intermittent failures is tedious and prone to human error. This skill automates the repetition, normalizes output from different runners, and applies severity scoring to help you prioritize fixes. It goes beyond simple "pass/fail" by analyzing error messages and execution patterns to suggest likely root causes, such as network timeouts or race conditions.
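As a rough illustration of severity scoring and root-cause hints, the sketch below maps a flake ratio to a label and matches error messages against a few patterns. The thresholds, labels other than "High", and regular expressions are all hypothetical; this page does not document the skill's real cut-offs or heuristics. The only grounded data point is that the sample output rates a 30% flake ratio as High.

```python
import re

# Hypothetical thresholds (minimum ratio, label), checked in order.
SEVERITY_THRESHOLDS = [(0.25, "High"), (0.10, "Medium"), (0.0, "Low")]

# Hypothetical message patterns and the cause each one hints at.
CAUSE_HINTS = [
    (re.compile(r"timed? ?out|connection (refused|reset)", re.I), "network timeout"),
    (re.compile(r"race|deadlock|concurrent", re.I), "race condition"),
]


def classify_severity(flake_ratio):
    """Return a severity label for a flake ratio in [0, 1]."""
    if flake_ratio <= 0:
        return "None"
    for threshold, label in SEVERITY_THRESHOLDS:
        if flake_ratio >= threshold:
            return label
    return "Low"


def suggest_cause(error_message):
    """Return the first matching cause hint, or None if nothing matches."""
    for pattern, cause in CAUSE_HINTS:
        if pattern.search(error_message):
            return cause
    return None
```

Keyword matching of this kind only points at a likely cause; confirming it still requires reading the failing test.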
Supported Tools
- Python: pytest, py.test
- JavaScript/TypeScript: Jest
- Go: go test
- Generic: Any runner with standard exit codes via Makefile or custom commands
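For the generic case, a runner that follows the usual exit-code convention (0 for success, non-zero for failure) can be repeated with nothing more than the standard library. The sketch below is an assumption about how such a loop might look, not the skill's own code; `run_repeatedly` is a hypothetical helper name.

```python
import subprocess
import sys


def run_repeatedly(command, runs):
    """Run `command` (a list of arguments) `runs` times.

    Returns a list of booleans, True where the process exited with code 0.
    """
    outcomes = []
    for _ in range(runs):
        result = subprocess.run(command, capture_output=True, text=True)
        outcomes.append(result.returncode == 0)
    return outcomes


if __name__ == "__main__":
    # Stand-in command that always succeeds; in practice this could be
    # something like ["make", "test"] or ["pytest", "tests/auth/"].
    outcomes = run_repeatedly([sys.executable, "-c", "raise SystemExit(0)"], 3)
    print(f"{outcomes.count(False)} failures in {len(outcomes)} runs")
```

Exit codes only give suite-level results. Per-test flake ratios need the runner's own output to be parsed.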
Output format
The skill produces two primary artifacts: a human-readable flake-report.md summary for quick review, and a structured flake-report.json for integration into CI/CD pipelines or further data analysis. Reports include flake ratios, failure message snippets, and suggested remediation steps.
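The schema of flake-report.json is not documented on this page, so the following is only a sketch of what a structured report carrying the fields mentioned above might look like. Every key name (`target`, `runs`, `flaky_count`, `tests`, and the per-test keys) is hypothetical.

```python
import json


def render_report(target, runs, tests):
    """Serialize per-test results to a JSON string.

    `tests` is a list of dicts with at least a "failures" count.
    All key names here are hypothetical, not the skill's real schema.
    """
    report = {
        "target": target,
        "runs": runs,
        # Count tests that failed at least once but not on every run.
        "flaky_count": sum(1 for t in tests if 0 < t["failures"] < runs),
        "tests": tests,
    }
    return json.dumps(report, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Values taken from the sample output on this page.
    print(render_report("tests/auth/", 20, [{
        "name": "test_session_expiry",
        "passes": 14,
        "failures": 6,
        "flake_ratio": 0.3,
        "severity": "High",
        "suggested_cause": "Race condition in session cleanup timing",
    }]))
```

A dashboard or CI step could then load the file with `json.load` and, for example, fail the build when `flaky_count` exceeds an agreed limit.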
Use Cases
- Locate intermittent failures that pass locally but fail in CI pipelines.
- Quantify test reliability using flake ratios and severity classifications.
- Generate structured JSON reports of test stability for engineering dashboards.
- Debug race conditions by isolating and repeatedly running specific test files.
How to Install
mkdir -p ~/.claude/skills && curl -sL https://www.agensi.io/api/install/test-flake-hunter -o /tmp/test-flake-hunter.zip && unzip -o /tmp/test-flake-hunter.zip -d ~/.claude/skills && rm /tmp/test-flake-hunter.zip
Free skills install directly. Paid skills require purchase; use the download button above after buying.
Reviews
No reviews yet - be the first to share your experience.
Only users who have downloaded or purchased this skill can leave a review.
Security Scanned
Passed automated security review