flaky-test-detector
by Timoranjes
Detect, diagnose, and fix intermittent test failures to stabilize your CI pipeline and restore developer trust.
- Eliminate 'sleep' calls and race conditions in async test suites.
- Identify tests that fail only when run in parallel or specific orders.
- Automate the quarantine of flaky tests to unblock the main merge pipeline.
$5 or 25 credits
Secure checkout via Stripe
Included in download
- Eliminate 'sleep' calls and race conditions in async test suites.
- Identify tests that fail only when run in parallel or specific orders.
- Terminal, browser, and network automation included
- Ready for Claude Code
- Instant install
Sample input
Investigate why test_auth_flow is failing intermittently in the CI pipeline. Run a detection loop and provide a diagnosis.
Sample output
FLAKY TEST DETECTED: test_auth_flow
Failure rate: 12% (6/50 runs)
Root Cause: Race condition in async session store.
Diagnosis: Fails under high load (-n auto) but passes in isolation.
Recommended Fix: Replace time.sleep(2) with poll-based wait_for(condition).
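The recommended fix in the sample output can be sketched roughly as follows. This is a minimal illustration, not the skill's own code: the wait_for signature, its timeout and interval defaults, and the FakeSessionStore class are all assumptions made up for the example.

```python
import threading
import time
from typing import Callable


def wait_for(condition: Callable[[], bool],
             timeout: float = 5.0,
             interval: float = 0.05) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Returns True as soon as the condition holds, False on timeout.
    (Hypothetical helper; name taken from the sample output above.)
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakeSessionStore:
    """Hypothetical stand-in for an async session store that becomes ready later."""

    def __init__(self) -> None:
        self._ready = threading.Event()

    def start(self, delay: float) -> None:
        # Flip the ready flag from a background thread after `delay` seconds.
        threading.Timer(delay, self._ready.set).start()

    def is_ready(self) -> bool:
        return self._ready.is_set()


# Instead of time.sleep(2) and hoping the store is ready, wait only as long as needed.
store = FakeSessionStore()
store.start(delay=0.2)
assert wait_for(store.is_ready, timeout=2.0), "session store never became ready"
```

A fixed sleep is both too slow when the store is ready early and too short when CI is under load. Polling with a deadline returns as soon as the condition holds and fails with a clear message when it never does.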
About This Skill
What it does
The Flaky Test Detector is a specialized diagnostic skill designed to stabilize your CI/CD pipeline by identifying and resolving non-deterministic test failures. It systematically hunts for the root causes of intermittent failures (tests that pass or fail without any code changes) across languages and frameworks, including Python (pytest), JavaScript (Jest), Go, and Java (JUnit).
Why use this skill
Manually debugging flaky tests is a massive time sink. This skill automates the investigative heavy lifting: it runs stress tests with rerun loops, analyzes CI logs for environmental patterns, and scans your codebase for anti-patterns like hardcoded sleeps, shared mutable state, and unmocked external dependencies. It doesn't just find the flakiness; it provides standardized fixes and quarantine strategies to unblock your team immediately.
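As a rough idea of what a rerun loop involves, the sketch below runs a test callable repeatedly and reports its failure rate. The FlakeReport class, the detect_flakiness function, and the simulated test are hypothetical names invented for this example. In a real suite, the loop would usually be driven by the test runner rather than a direct function call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FlakeReport:
    name: str
    runs: int
    failures: int

    @property
    def failure_rate(self) -> float:
        return self.failures / self.runs if self.runs else 0.0

    @property
    def verdict(self) -> str:
        if self.failures == 0:
            return "stable"
        if self.failures == self.runs:
            return "consistently failing"
        return "flaky"  # passes sometimes, fails sometimes


def detect_flakiness(test: Callable[[], None], runs: int = 50) -> FlakeReport:
    """Run `test` `runs` times and count how often it raises."""
    failures = 0
    for _ in range(runs):
        try:
            test()
        except Exception:  # AssertionError is a subclass of Exception
            failures += 1
    return FlakeReport(test.__name__, runs, failures)


# Simulated intermittent test: fails on every 8th call (illustrative only).
_calls = 0


def test_auth_flow_simulated() -> None:
    global _calls
    _calls += 1
    assert _calls % 8 != 0, "simulated intermittent failure"


report = detect_flakiness(test_auth_flow_simulated, runs=50)
print(f"{report.name}: {report.verdict}, "
      f"{report.failures}/{report.runs} runs failed")
```

Separating "flaky" from "consistently failing" matters: a test that fails on every run is a plain bug or a broken environment, not a candidate for quarantine.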
Supported Features
- Detection Loops: Automated execution of suspected tests (50+ runs) to calculate failure rates.
- Isolation Testing: Detects shared state by running tests in isolation vs. parallel execution.
- Pattern Scanning: Greps for problematic syntax like Math.random(), sleep(), or global fixtures.
- Automatic Quarantining: Generates standard @skip or @mark.flaky wrappers and tracking registries.
- CI Integration: Provides YAML configurations for GitHub Actions to detect and report flakes automatically.
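The pattern-scanning feature above could look something like the sketch below. The regular expressions, labels, and the scan_source function are assumptions for illustration, not the skill's real rule set. In particular, reading "global fixtures" as pytest fixtures with session or module scope is a guess.

```python
import re
from typing import Iterator, NamedTuple

# Illustrative rules only; a real scanner would need per-language tuning.
ANTI_PATTERNS = {
    "hardcoded sleep": re.compile(r"\b(?:time\.)?sleep\s*\("),
    "nondeterministic randomness": re.compile(r"\bMath\.random\s*\("),
    "shared fixture scope": re.compile(
        r"""@pytest\.fixture\([^)]*scope\s*=\s*["'](?:session|module)["']"""
    ),
}


class Finding(NamedTuple):
    line_no: int
    label: str
    text: str


def scan_source(source: str) -> Iterator[Finding]:
    """Yield one Finding per (line, rule) match in `source`."""
    for line_no, line in enumerate(source.splitlines(), start=1):
        for label, pattern in ANTI_PATTERNS.items():
            if pattern.search(line):
                yield Finding(line_no, label, line.strip())


SAMPLE = '''\
import time

def test_auth_flow():
    start_session()
    time.sleep(2)
    assert session_is_active()
'''

for finding in scan_source(SAMPLE):
    print(f"line {finding.line_no}: {finding.label}: {finding.text}")
```

A line-based regex scan is cheap and language-agnostic, but it produces false positives (for example, a sleep call inside a comment), so findings are best treated as leads to check against rerun results.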
Use Cases
- Eliminate 'sleep' calls and race conditions in async test suites.
- Identify tests that fail only when run in parallel or specific orders.
- Automate the quarantine of flaky tests to unblock the main merge pipeline.
- Detect environment-specific failures caused by timezones or locale differences.
Known Limitations
- Cannot diagnose hardware-level flakes or intermittent network outages.
- Detection time scales with test suite execution speed, so large rerun loops (50+ runs) are slow on long-running suites.
How to Install
mkdir -p ~/.claude/skills && curl -sL https://www.agensi.io/api/install/flaky-test-detector -o /tmp/flaky-test-detector.zip && unzip -o /tmp/flaky-test-detector.zip -d ~/.claude/skills && rm /tmp/flaky-test-detector.zip
Free skills install directly. Paid skills require purchase: use the download button above after buying.
Security Scanned
Passed automated security review
Claude Code, Cursor, Aider, GitHub Copilot Workspace