harness-engineering
by Roy Yuen
Build production-grade AI harnesses with explicit control contracts, verification loops, and adversarial testing.
- Implement multi-pass plan/execute/verify loops for complex agent tasks.
- Design safety gates and adversarial test suites for AI tool boundaries.
- Create stateful replay tests to debug agent regressions in production.
$8
One-time purchase · Own forever
Included in download
- Implement multi-pass plan/execute/verify loops for complex agent tasks.
- Design safety gates and adversarial test suites for AI tool boundaries.
- Includes example output and usage patterns
See it in action
Contract: Verifier must run before final output.
[Action] Patched executor gateway in loop.ts
[Test] Replay scenario_04: REPRODUCED skip behavior.
[Test] Replay scenario_04 (Post-fix): VERIFIED gate enforcement.
Result: Verified fix. No regressions in stateful memory buffer.
About This Skill
Advanced AI Control & Testing
Building reliable AI agents takes more than good prompting; it takes robust engineering around the model. This skill provides a framework for designing, debugging, and hardening AI harnesses: the scaffolding that governs how an agent plans, executes, and verifies its work. It addresses common failure modes such as agents going off the rails, skipping safety checks, or returning unverified results.
What it does
The Harness Engineering skill implements a structured methodology for agent orchestration. It allows you to build sophisticated control loops using a multi-role architecture:
- Planner: Defines contracts and stop rules.
- Executor: Performs bounded actions.
- Verifier: Validates results against evidence.
- Critic/Recovery: Identifies regressions and manages error state.
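The four roles above can be pictured as a small control loop. The following is a minimal sketch, not the skill's actual implementation: `Contract`, `RunLog`, `run_harness`, and `flaky_executor` are hypothetical names invented for illustration, and the event strings are an assumed logging format.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Contract:
    """Planner output (hypothetical shape): a goal, a stop rule, and a check."""
    goal: str
    max_attempts: int                 # stop rule: bounded number of retries
    check: Callable[[str], bool]      # evidence check applied by the verifier


@dataclass
class RunLog:
    """Ordered record of what each role did, for later auditing."""
    events: List[str] = field(default_factory=list)


def run_harness(
    contract: Contract,
    executor: Callable[[str, int], str],
) -> Tuple[Optional[str], RunLog]:
    log = RunLog()
    log.events.append(f"plan:{contract.goal}")
    for attempt in range(1, contract.max_attempts + 1):
        # Executor: one bounded action per attempt.
        result = executor(contract.goal, attempt)
        log.events.append(f"execute:{attempt}")
        # Verifier: always runs before a result can be returned.
        ok = contract.check(result)
        log.events.append(f"verify:{attempt}:{'pass' if ok else 'fail'}")
        if ok:
            return result, log
        # Critic/Recovery: record the failure and let the loop retry.
        log.events.append(f"recover:{attempt}")
    # Stop rule reached without a verified result.
    return None, log


def flaky_executor(goal: str, attempt: int) -> str:
    """Toy executor (assumption): fails the first attempt, then succeeds."""
    return "draft" if attempt == 1 else "answer: 4"


contract = Contract(
    goal="compute 2+2",
    max_attempts=3,
    check=lambda r: r.endswith("4"),
)
result, log = run_harness(contract, flaky_executor)
```

The only path that returns a non-empty result passes through the verifier, which is one way to encode a rule like "Verifier must run before final output" structurally rather than by convention.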
Why use this skill
Unlike standard prompting, this skill enforces explicit contracts and authority boundaries. It uses a "Validation Ladder" approach that moves from simple schema checks up to adversarial testing and stateful loop replays. Outputs come with a clear audit trail and are labeled by confidence level: Verified, Inferred, or Unknown.
It is best suited to developers building production-grade agentic workflows, eval pipelines, or safety-critical tool boundaries where unverified or hallucinated output is unacceptable.
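One way to read the "Validation Ladder" and its confidence labels is sketched below. The listing does not say how rungs map to labels, so this mapping is an assumption: all rungs passed gives Verified, some passed gives Inferred, none gives Unknown. The rung names, `climb`, and the JSON output shape are also hypothetical.

```python
import json
from enum import Enum
from typing import Callable, List, Tuple


class Confidence(Enum):
    VERIFIED = "Verified"
    INFERRED = "Inferred"
    UNKNOWN = "Unknown"


Rung = Tuple[str, Callable[[str], bool]]


def is_json_object(raw: str) -> bool:
    """Lowest rung: the output parses as a JSON object."""
    try:
        return isinstance(json.loads(raw), dict)
    except json.JSONDecodeError:
        return False


def has_evidence(raw: str) -> bool:
    """Higher rung: the object carries a non-empty 'evidence' field.

    Only called after is_json_object has passed, so parsing is safe here.
    """
    return bool(json.loads(raw).get("evidence"))


# Rungs ordered from cheapest to strictest (illustrative, not exhaustive).
LADDER: List[Rung] = [("schema", is_json_object), ("evidence", has_evidence)]


def climb(raw: str, ladder: List[Rung]) -> Tuple[Confidence, List[str]]:
    """Run rungs in order, stop at the first failure, and label the result."""
    passed: List[str] = []
    for name, check in ladder:
        if not check(raw):
            break
        passed.append(name)
    if ladder and len(passed) == len(ladder):
        label = Confidence.VERIFIED
    elif passed:
        label = Confidence.INFERRED
    else:
        label = Confidence.UNKNOWN
    return label, passed
```

For example, `climb('{"answer": 4}', LADDER)` passes only the schema rung and is labeled Inferred, while the list of passed rungs doubles as a small audit trail.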
Use Cases
- Implement multi-pass plan/execute/verify loops for complex agent tasks.
- Design safety gates and adversarial test suites for AI tool boundaries.
- Create stateful replay tests to debug agent regressions in production.
- Standardize agent reporting using Verified, Inferred, and Unknown status.
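A stateful replay test, as mentioned in the use cases, can be sketched as replaying a recorded sequence of steps against the gate logic while tracking state between steps. This is a hedged illustration only: `replay`, `GateError`, the step names, and both traces are hypothetical and do not come from the skill itself.

```python
from typing import List


class GateError(Exception):
    """Raised when a final output is attempted without prior verification."""


def replay(trace: List[str], enforce_gate: bool) -> List[str]:
    """Replay recorded step names in order, carrying state across steps."""
    verified = False              # state that persists between steps
    emitted: List[str] = []
    for step in trace:
        if step == "execute":
            verified = False      # new work invalidates earlier verification
        elif step == "verify":
            verified = True
        elif step == "final":
            if enforce_gate and not verified:
                raise GateError("final output attempted before verification")
            emitted.append("final(verified)" if verified else "final(unverified)")
        # Other steps (e.g. "plan") do not change the gate state.
    return emitted


# Hypothetical recorded trace in which the verifier was skipped.
SKIP_TRACE = ["plan", "execute", "final"]
# Hypothetical trace of a well-behaved run.
GOOD_TRACE = ["plan", "execute", "verify", "final"]

# Without the gate, the skip is reproduced: an unverified output is emitted.
reproduced = replay(SKIP_TRACE, enforce_gate=False)

# With the gate enforced, the same trace is rejected.
try:
    replay(SKIP_TRACE, enforce_gate=True)
    gate_held = False
except GateError:
    gate_held = True
```

Replaying the same trace before and after a fix is what lets a regression be reported as reproduced first and then verified as fixed, rather than assumed fixed.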
How to Install
unzip harness-engineering.zip -d ~/.claude/skills/
Security Scanned
Passed automated security review
Permissions
No special permissions declared or detected