flaky-test-detector
by Zicheng Liao
Detect, diagnose, and fix intermittent test failures to stabilize your CI pipeline and restore developer trust.
- Eliminate 'sleep' calls and race conditions in async test suites.
- Identify tests that fail only when run in parallel or specific orders.
- Automate the quarantine of flaky tests to unblock the main merge pipeline.
Secure checkout via Stripe
Included in download
- Eliminate 'sleep' calls and race conditions in async test suites.
- Identify tests that fail only when run in parallel or specific orders.
- terminal, browser, network automation included
- Includes example output and usage patterns
See it in action
A real example of what this skill takes in and produces.
Sample output
FLAKY TEST DETECTED: test_auth_flow Failure rate: 12% (6/50 runs) Root Cause: Race condition in async session store. Diagnosis: Fails under high load (-n auto) but passes in isolation. Recommended Fix: Replace time.sleep(2) with poll-based wait_for(condition).
flaky-test-detector
by Zicheng Liao
Detect, diagnose, and fix intermittent test failures to stabilize your CI pipeline and restore developer trust.
Secure checkout via Stripe
Included in download
- Eliminate 'sleep' calls and race conditions in async test suites.
- Identify tests that fail only when run in parallel or specific orders.
- terminal, browser, network automation included
- Includes example output and usage patterns
- Instant install
See it in action
A real example of what this skill takes in and produces.
Sample output
FLAKY TEST DETECTED: test_auth_flow Failure rate: 12% (6/50 runs) Root Cause: Race condition in async session store. Diagnosis: Fails under high load (-n auto) but passes in isolation. Recommended Fix: Replace time.sleep(2) with poll-based wait_for(condition).
About This Skill
What it does
The Flaky Test Detector is a specialized diagnostic skill designed to stabilize your CI/CD pipeline by identifying and resolving non-deterministic test failures. It systematically hunts for the root causes of "intermittent" failures—tests that pass or fail without any code changes—across any language or framework including Python (pytest), JavaScript (Jest), Go, and Java (JUnit).
Why use this skill
Manually debugging flaky tests is a massive time sink. This skill automates the investigative heavy lifting: it runs stress tests with rerun loops, analyzes CI logs for environmental patterns, and scans your codebase for anti-patterns like hardcoded sleeps, shared mutable state, and unmocked external dependencies. It doesn't just find the flakiness; it provides standardized fixes and quarantine strategies to unblock your team immediately.
Supported Features
- Detection Loops: Automated execution of suspected tests (50+ runs) to calculate failure rates.
- Isolation Testing: Detects shared state by running tests in isolation vs. parallel execution.
- Pattern Scanning: Greps for problematic syntax like
Math.random(),sleep(), or global fixtures. - Automatic Quarantining: Generates standard
@skipor@mark.flakywrappers and tracking registries. - CI Integration: Provides YAML configurations for GitHub Actions to detect and report flakes automatically.
Use Cases
- Eliminate 'sleep' calls and race conditions in async test suites.
- Identify tests that fail only when run in parallel or specific orders.
- Automate the quarantine of flaky tests to unblock the main merge pipeline.
- Detect environment-specific failures caused by timezones or locale differences.
How to Install
mkdir -p ~/.claude/skills && curl -sL https://www.agensi.io/api/install/flaky-test-detector | tar xz -C ~/.claude/skills/Free skills install directly. Paid skills require purchase - use the download button above after buying.
Reviews
No reviews yet - be the first to share your experience.
Only users who have downloaded or purchased this skill can leave a review.
Early access skill
Be the first to review this skill.
Only users who have downloaded or purchased this skill can leave a review.
Security Scanned
Passed automated security review
Permissions
Allowed Hosts
File Scopes
Creator
Frequently Asked Questions
Learn More About AI Agent Skills
More Premium Skills
diagnosing-rag-failure-modes
RAG fails quietly. It retrieves documents, returns confident-looking answers, and misses the question entirely — because the question required connecting facts across documents, reasoning about sequence, or tracing causation. This skill gives you a five-question diagnostic checklist that classifies any failing query as either RAG-safe or structurally RAG-incompatible, then maps it to the specific failure pattern and the architectural fix that resolves it.
skill-router-2
Automatically detect, load, and stack the perfect skills combo for any user request.
cinematic-sites
Turn any basic business URL into a high-end cinematic landing page with AI-generated 4K assets and GSAP animations.
software-architect
A structured framework for planning, reviewing, and evolving complex software systems with explicit trade-offs.