rag-architect
Design, debug, and optimize production RAG systems with expert architecture, hybrid search, and grounding strategies.
by Roy Yuen
About This Skill
Advanced RAG System Architecture & Debugging
Designing a production-ready Retrieval-Augmented Generation (RAG) system requires more than just a vector database and a prompt. The RAG Architect skill provides a developer-centric framework for building, hardening, and troubleshooting complex retrieval stacks, moving beyond generic implementations to high-performance architecture.
What it does
This skill acts as a senior systems architect for your AI pipeline. It analyzes ingestion workflows, document parsing, chunking strategies, embedding selection, and vector store performance. Whether you are building from scratch or fixing a broken implementation, it applies a rigorous, evidence-based methodology to ensure your agent stays grounded and accurate.
Supported Capabilities
- Architecture Design: Decisions for hybrid search, reranking, and context packing tailored to your specific corpus (Legal, Code, Product Docs, etc.).
- Truth-First Debugging: Systematic isolation of failures across the pipeline, from bad parsing to stale indexes and tenant leakage.
- Infrastructure Selection: Unbiased tradeoff analysis for vector databases (pgvector, Qdrant, Milvus), embedding models, and rerankers.
- Production Hardening: Implementing multi-tenant isolation, citation grounding, and incremental re-indexing strategies.
- Evaluation Frameworks: Establishing metrics such as recall@k, precision, and faithfulness so that changes are data-driven rather than anecdotal.
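To make the hybrid search and evaluation bullets concrete, here is a minimal sketch, not the skill's actual implementation. It assumes reciprocal rank fusion (RRF) as the way to merge a keyword ranking with a vector ranking, which is one common choice among several. The function names, the document IDs, and the constant `k=60` are illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one ranking (RRF).

    Each document scores 1 / (k + rank) per list it appears in;
    k=60 is a conventional default, assumed here for illustration.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant docs that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)


def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k results that are relevant."""
    top = retrieved[:k]
    if not top:
        return 0.0
    return sum(1 for d in top if d in relevant) / len(top)


# Hypothetical rankings for one query and its labeled relevant docs.
keyword_hits = ["d3", "d1", "d7"]
vector_hits = ["d1", "d4", "d3"]
relevant = {"d1", "d3"}

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
vector_recall = recall_at_k(vector_hits, relevant, 2)   # 0.5
fused_recall = recall_at_k(fused, relevant, 2)          # 1.0
fused_precision = precision_at_k(fused, relevant, 2)    # 1.0
```

Comparing `vector_recall` with `fused_recall` over a labeled query set, rather than a single query as here, is what turns a claim like "hybrid search helped" into a measured result.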
Why use this skill?
Standard LLM prompts often treat "bad answers" as model hallucinations. This skill identifies when the problem is actually a metadata filter mismatch, poor chunking semantics, or an inefficient reranker. It helps you reduce latency and cost by optimizing the weakest stage of your pipeline rather than over-relying on expensive long-context windows.
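As a rough illustration of separating a retrieval fault from a model fault, the sketch below classifies why an expected document did not reach the prompt. It is a simplified assumption of how such a check might look, not the skill's own procedure. The function names, the equality-only filter semantics, and the labels it returns are all hypothetical.

```python
def matches(metadata, query_filter):
    """True if every filter key equals the document's metadata value."""
    return all(metadata.get(key) == value for key, value in query_filter.items())


def classify_miss(expected_id, ranked_ids, metadata_by_id, query_filter, k):
    """Explain where an expected document was lost.

    ranked_ids is the retriever's ranking *before* metadata filtering.
    """
    if expected_id not in metadata_by_id:
        return "not_indexed"          # ingestion or stale-index problem
    if not matches(metadata_by_id[expected_id], query_filter):
        return "filter_mismatch"      # the filter excludes the right document
    filtered = [
        d for d in ranked_ids if matches(metadata_by_id.get(d, {}), query_filter)
    ]
    if expected_id in filtered[:k]:
        return "retrieved"            # look downstream: packing or generation
    return "ranked_too_low"           # chunking, embedding, or reranking problem


# Hypothetical index metadata: "d2" was ingested with an upper-case language tag.
metadata_by_id = {
    "d1": {"tenant": "acme", "lang": "en"},
    "d2": {"tenant": "acme", "lang": "EN"},
    "d3": {"tenant": "acme", "lang": "en"},
}
query_filter = {"tenant": "acme", "lang": "en"}
ranked_ids = ["d2", "d1", "d3"]

verdict = classify_miss("d2", ranked_ids, metadata_by_id, query_filter, k=2)
```

In this example `verdict` is `"filter_mismatch"`: the document ranks first on similarity but never reaches the prompt, so changing the model or the prompt would not fix the answer.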
How to Install
unzip rag-architect.zip -d ~/.claude/skills/
$5
One-time purchase • Own forever
Security Scanned
Passed automated security review
Permissions
No special permissions declared or detected
Similar Skills
designing-hybrid-context-layers
Architects the right retrieval strategy for every query — teaching your agent when to use RAG, a knowledge graph, or a temporal index instead of defaulting to vector search for everything.
diagnosing-rag-failure-modes
RAG fails quietly. It retrieves documents, returns confident-looking answers, and misses the question entirely — because the question required connecting facts across documents, reasoning about sequence, or tracing causation. This skill gives you a five-question diagnostic checklist that classifies any failing query as either RAG-safe or structurally RAG-incompatible, then maps it to the specific failure pattern and the architectural fix that resolves it.
env-doctor
Diagnoses why your project will not start. Checks runtime versions, dependencies, environment variables, databases, ports, and build artifacts systematically.
benchmarking-ai-agents-beyond-models
Published AI benchmarks measure brains in jars. They test models in isolation or within a single reference harness — and then attribute all performance to the model. This skill teaches you to decompose agent performance into its two actual components: model capability and harness multiplier. The result is evaluations that predict real-world behavior instead of benchmark theater.