RAG Architect
by Roy Yuen
Design, debug, and optimize production RAG systems with expert architecture, hybrid search, and grounding strategies.
- Construct hybrid search pipelines combining semantic and keyword retrieval
- Debug hallucination risks by implementing strict source grounding protocols
- Optimize indexing strategies for low-latency document retrieval at scale
$5, or 25 credits
Secure checkout via Stripe
Included in download
- Construct hybrid search pipelines combining semantic and keyword retrieval
- Debug hallucination risks by implementing strict source grounding protocols
- Ready for Claude Code
- Includes example output and usage patterns
- Instant install
Sample input
Our RAG system isn't finding specific technical errors like 'error 402' in the logs, even though the docs are indexed. Can you diagnose why retrieval is failing for these identifiers?
Sample output
Diagnosis: Low recall@k. Hypothesis: Missing BM25/keyword search for technical identifiers. Evidence: Search 'error 402' returns generic HTTP docs, not specific logs. Fix: Implement Hybrid Search with RRF + Metadata filters for log levels. Expected Impact: +25% precision on technical queries.
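The fix named in the sample output, hybrid search with RRF (Reciprocal Rank Fusion), can be sketched roughly as follows. This is a minimal illustration, not the skill's own implementation: the document IDs and both ranked lists are hypothetical, and `k=60` is a commonly used smoothing constant assumed here rather than a value taken from the skill.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1. k=60 is an assumed, conventional default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for the query "error 402".
keyword_hits = ["log-402-a", "log-402-b", "http-status-guide"]    # e.g. BM25
semantic_hits = ["http-status-guide", "log-402-a", "billing-faq"]  # e.g. vector search

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)
```

In this toy data the log entry that both retrievers return ranks first, ahead of the generic HTTP document that only the semantic retriever favors. Metadata filters such as log level would typically be applied inside each retriever before fusion.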
About This Skill
Advanced RAG System Architecture & Debugging
Designing a production-ready Retrieval-Augmented Generation (RAG) system requires more than just a vector database and a prompt. The RAG Architect skill provides a developer-centric framework for building, hardening, and troubleshooting complex retrieval stacks, moving beyond generic implementations to high-performance architecture.
What it does
This skill acts as a senior systems architect for your AI pipeline. It analyzes ingestion workflows, document parsing, chunking strategies, embedding selection, and vector store performance. Whether you are building from scratch or fixing a broken implementation, it applies a rigorous, evidence-based methodology to ensure your agent stays grounded and accurate.
Supported Capabilities
- Architecture Design: Decisions for hybrid search, reranking, and context packing tailored to your specific corpus (Legal, Code, Product Docs, etc.).
- Truth-First Debugging: Systematic isolation of failures across the pipeline—from bad parsing to stale indexes and tenant leakage.
- Infrastructure Selection: Unbiased tradeoff analysis for vector databases (pgvector, Qdrant, Milvus), embedding models, and rerankers.
- Production Hardening: Implementing multi-tenant isolation, citation grounding, and incremental re-indexing strategies.
- Evaluation Frameworks: Establishing metrics for recall@k, precision, and faithfulness to ensure changes are data-driven rather than anecdotal.
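As a rough illustration of the retrieval metrics named in the list above, recall@k and precision@k can be computed from a ranked result list and a set of known-relevant documents. The function names, document IDs, and labels below are hypothetical. Faithfulness is not covered, since it usually needs a judge model or human review.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant documents that appear in the top k results."""
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def precision_at_k(retrieved, relevant, k):
    """Fraction of the top k slots filled by relevant documents.

    Divides by k even when fewer than k results came back, so missing
    results count against precision (one common convention).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant)
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


# Hypothetical labeled query: three relevant documents, five retrieved.
retrieved = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2", "d4"}

for k in (3, 5):
    print(k, recall_at_k(retrieved, relevant, k), precision_at_k(retrieved, relevant, k))
```

Averaging these values over a fixed set of labeled queries before and after a pipeline change is one way to keep comparisons data-driven.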
Why use this skill?
Standard LLM prompts often treat "bad answers" as model hallucinations. This skill identifies when the problem is actually a metadata filter mismatch, poor chunking semantics, or an inefficient reranker. It helps you reduce latency and cost by optimizing the weakest stage of your pipeline rather than over-relying on expensive long-context windows.
Use Cases
- Construct hybrid search pipelines combining semantic and keyword retrieval
- Debug hallucination risks by implementing strict source grounding protocols
- Optimize indexing strategies for low-latency document retrieval at scale
- Architect multi-stage re-ranking workflows to improve answer precision
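One simple form a source grounding protocol can take is a citation check on the generated answer. The sketch below assumes a hypothetical convention in which the model cites sources inline as `[doc-id]` before the sentence-ending punctuation. It only verifies that citations exist and point at retrieved documents, not that the cited text actually supports the claim.

```python
import re

# Assumed citation format: [doc-id] with letters, digits, "_" or "-".
CITATION = re.compile(r"\[([A-Za-z0-9_-]+)\]")


def check_grounding(answer, retrieved_ids):
    """Flag sentences without citations and citations to unretrieved docs."""
    allowed = set(retrieved_ids)
    # Naive sentence split on whitespace that follows ., ! or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    uncited, unknown = [], []
    for sentence in sentences:
        cited = CITATION.findall(sentence)
        if not cited:
            uncited.append(sentence)
        unknown.extend(doc_id for doc_id in cited if doc_id not in allowed)
    return {
        "uncited_sentences": uncited,
        "unknown_citations": unknown,
        "grounded": not uncited and not unknown,
    }


# Hypothetical answer and retrieval set.
answer = (
    "Error 402 means payment is required [log-17]. "
    "Retry after updating billing [kb-9]. "
    "This always fixes it."
)
report = check_grounding(answer, ["log-17", "kb-2"])
print(report)
```

An answer that fails this check could be regenerated or returned with a warning, depending on how strict the deployment needs to be.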
Known Limitations
- Cannot perform the actual vector DB migration or infrastructure provisioning.
- Effectiveness is limited without access to specific log samples or retrieval metrics.
- Does not generate frontend UI.
How to Install
mkdir -p ~/.claude/skills && curl -sL https://www.agensi.io/api/install/rag-architect -o /tmp/rag-architect.zip && unzip -o /tmp/rag-architect.zip -d ~/.claude/skills && rm /tmp/rag-architect.zip
Free skills install directly. Paid skills require purchase - use the download button above after buying.
Security Scanned
Passed automated security review
Permissions
No special permissions declared or detected
Compatible with Claude Code, GitHub Copilot Extensions, Cursor, and SKILL.md-compatible agents.