
    skill-evaluation-and-iteration-with-codex

    by Markus Isaksson

    Score, diagnose, and rewrite underperforming AI agent skills to improve triggering and reliability.

    Updated May 2026
    Security scanned
    One-time purchase
    Claude

    $5


    30-day refund guarantee

    Secure checkout via Stripe

    Included in download

• Score existing skills against a strict 9-dimension quality rubric
    • Fix inconsistent agent behavior by refining output contracts and boundaries
• Built-in file_read, file_write, and terminal automation
    • Ready for Claude
    • Instant install

    Sample Output

    A real example of what this skill produces.

SKILL EVALUATION REPORT: [image-optimizer-v2]
OVERALL SCORE: 6.4/10
ROOT CAUSE: Triggering is too vague; the agent misidentifies CSV files as images.
PLAN:

    1. Rewrite Trigger Conditions to exclude non-image extensions.
    2. Define Output Contract for WebP conversion.
    3. Patch SKILL.md.
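As a hedged sketch of what step 1 of that plan could produce, the patched SKILL.md frontmatter might look like the following. The field names follow common SKILL.md conventions, but the exact values here are illustrative, not the skill's actual output:

```yaml
# Hypothetical frontmatter for image-optimizer-v2 after the triggering fix.
name: image-optimizer-v2
description: >
  Optimize raster images (.png, .jpg, .jpeg, .webp) for the web.
  Use when the user asks to compress, resize, or convert image files.
  Do not trigger for CSV, JSON, or other non-image file types.
```

Tightening the description this way narrows the trigger surface, which is what resolves the CSV misidentification called out in the root cause.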

    About This Skill

    Analyze and Optimize Your AI Agent Skills

    Developing high-quality AI agent skills requires more than just a good prompt. This skill is a developer's diagnostic tool for auditing, scoring, and refining existing SKILL.md files. It bridges the gap between a "functional" skill and a "marketplace-ready" product by identifying why a skill might be failing to trigger, producing inconsistent results, or lacking clear boundaries.

    What it does

    • Quantitative Scoring: Evaluates skills across 9 dimensions including Purpose, Triggering, Guardrails, and Codex Compatibility using a strict 1-10 rubric.
    • Root Cause Analysis: Pinpoints structural failures like placeholder output templates, vague procedures, or over-broad permission scopes.
    • Automated Refinement: Generates a prioritized improvement plan and can directly patch files to bring older skills up to modern Codex and OpenAI standards.
    • Validation: Integrates with local validation scripts to ensure frontmatter and file structures meet distribution requirements.
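The scoring step above can be sketched in a few lines. This is a minimal illustration of averaging 1-10 scores across 9 dimensions and surfacing the weakest ones as root-cause candidates; the dimension names are assumptions based on the four dimensions named above, not the skill's actual rubric:

```python
from statistics import mean

# Hypothetical dimension list; Purpose, Triggering, Guardrails, and
# Codex Compatibility are named above, the rest are illustrative.
DIMENSIONS = [
    "purpose", "triggering", "procedure", "output_contract",
    "guardrails", "boundaries", "examples", "validation",
    "codex_compatibility",
]

def overall_score(scores: dict[str, float]) -> float:
    """Average the 1-10 per-dimension scores into one overall score."""
    missing = set(DIMENSIONS) - scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    return round(mean(scores[d] for d in DIMENSIONS), 1)

def weakest(scores: dict[str, float], n: int = 3) -> list[str]:
    """Return the n lowest-scoring dimensions as root-cause candidates."""
    return sorted(DIMENSIONS, key=lambda d: scores[d])[:n]
```

A skill scoring 6.0 everywhere except a 3.0 on triggering would land in the mid-5s overall, with triggering flagged first for rewriting.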

    Why use this skill

Prompting an AI to "improve my skill" often yields generic advice. This skill operates as a specialized auditor that understands the architecture of agentic frameworks. It ensures your SKILL.md is written as an executable procedure rather than a list of suggestions, significantly increasing the reliability and conversion rate of your agentic tools.


    Use Cases

• Score existing skills against a strict 9-dimension quality rubric
    • Fix inconsistent agent behavior by refining output contracts and boundaries
    • Update legacy Agensi skills to modern Codex and OpenAI metadata standards
    • Generate prioritized rewrite plans to increase skill marketplace conversion


    Security Scanned

    Passed automated security review

    Permissions

    Read Files
    Write Files
    Terminal / Shell

    File Scopes

    **/SKILL.md
    **/agents/openai.yaml
    **/*.md
    codex/agensi/**
    .codex/skills/**

Permission Profile: Read + Write documentation

    This skill is optimized for Codex acting on local skill files. It can review Grok, Claude, ChatGPT, Cursor, and portable Agensi skills, but Codex-specific validation applies only to Codex skill folders.

