THE AGENSI STORE

Browse The Skill Store

AI Feature Eval Writer — Golden Datasets, Rubrics, and LLM-as-Judge Prompts That Actually Catch Regressions

$14

Design and write the eval suite for your LLM-powered feature: metrics that match your failure modes, a golden dataset plan with starter cases, anchored rubrics, LLM-as-judge prompts that include the known bias mitigations, and pass/fail gates wired for CI.

evals, llm-evaluation, llm-as-judge, +6