
    rag-architect

    by Roy Yuen

    Design, debug, and optimize production RAG systems with expert architecture, hybrid search, and grounding strategies.

    Updated May 2026
    Security scanned

    $5

    One-time purchase · Own forever

    Also available via Agensi MCP: your AI agent can load this skill on demand.

    Included in download

    • Construct hybrid search pipelines combining semantic and keyword retrieval
    • Debug hallucination risks by implementing strict source grounding protocols
    • Ready for Claude Code
    • Includes example output and usage patterns
    • Instant install

    See it in action

    Diagnosis: Low recall@k.
    Hypothesis: Missing BM25/keyword search for technical identifiers.
    Evidence: Search 'error 402' returns generic HTTP docs, not specific logs. 
    Fix: Implement Hybrid Search with RRF + Metadata filters for log levels.
    Expected Impact: +25% precision on technical queries.
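
    The fix named in the example output, hybrid search with reciprocal rank fusion (RRF) plus a metadata filter on log level, can be sketched roughly as follows. This is a minimal illustration, not the skill's own implementation: the document ids, the metadata, and both rankings are hypothetical toy data, and k=60 is only a commonly used default for the RRF constant.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) per list it appears in;
    documents ranked well by more than one retriever rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def filter_by_level(doc_ids, metadata, level):
    """Keep only documents whose metadata 'level' field matches."""
    return [d for d in doc_ids if metadata.get(d, {}).get("level") == level]


# Hypothetical corpus metadata (assumption, for illustration only).
metadata = {
    "log-402-billing": {"level": "error"},
    "http-status-guide": {"level": "info"},
    "log-timeout": {"level": "error"},
    "log-402-retry": {"level": "error"},
}

# Hypothetical output of a dense (semantic) retriever and a keyword retriever.
dense_ranking = ["http-status-guide", "log-timeout", "log-402-billing"]
keyword_ranking = ["log-402-billing", "log-402-retry", "http-status-guide"]

# Apply the metadata filter to each list, then fuse.
dense = filter_by_level(dense_ranking, metadata, "error")
keyword = filter_by_level(keyword_ranking, metadata, "error")
fused = reciprocal_rank_fusion([dense, keyword])
print(fused)
```

    In this toy case the generic HTTP guide is removed by the filter, and the log entry found by both retrievers ends up first.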

    About This Skill

    Advanced RAG System Architecture & Debugging

    Designing a production-ready Retrieval-Augmented Generation (RAG) system requires more than just a vector database and a prompt. The RAG Architect skill provides a developer-centric framework for building, hardening, and troubleshooting complex retrieval stacks, moving beyond generic implementations to high-performance architecture.

    What it does

    This skill acts as a senior systems architect for your AI pipeline. It analyzes ingestion workflows, document parsing, chunking strategies, embedding selection, and vector store performance. Whether you are building from scratch or fixing a broken implementation, it applies a rigorous, evidence-based methodology to ensure your agent stays grounded and accurate.

    Supported Capabilities

    • Architecture Design: Decisions for hybrid search, reranking, and context packing tailored to your specific corpus (Legal, Code, Product Docs, etc.).
    • Truth-First Debugging: Systematic isolation of failures across the pipeline—from bad parsing to stale indexes and tenant leakage.
    • Infrastructure Selection: Unbiased tradeoff analysis for vector databases (pgvector, Qdrant, Milvus), embedding models, and rerankers.
    • Production Hardening: Implementing multi-tenant isolation, citation grounding, and incremental re-indexing strategies.
    • Evaluation Frameworks: Establishing metrics for recall@k, precision, and faithfulness to ensure changes are data-driven rather than anecdotal.
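
    As a rough illustration of the retrieval metrics named in the list above, recall@k and precision@k can be computed from a ranked result list and a set of known-relevant ids. The function names and the sample ids are assumptions made for this sketch; faithfulness is not covered here because it typically needs a judged comparison between the answer and its sources.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


def precision_at_k(retrieved, relevant, k):
    """Fraction of the top k results that are relevant."""
    if k <= 0:
        return 0.0
    relevant = set(relevant)
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


# Hypothetical ranked results and ground-truth labels for one query.
retrieved = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2", "d4"}

for k in (3, 5):
    print(k, recall_at_k(retrieved, relevant, k), precision_at_k(retrieved, relevant, k))
```

    Averaging these values over a fixed query set before and after a pipeline change is one simple way to keep comparisons data-driven.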

    Why use this skill?

    Generic LLM prompting tends to write off every "bad answer" as a model hallucination. This skill identifies when the real cause is a metadata filter mismatch, poor chunking semantics, or a reranker that misorders candidates. It helps you reduce latency and cost by optimizing the weakest stage of your pipeline rather than over-relying on expensive long-context windows.
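
    One way to picture this kind of stage isolation: for a query whose correct source document is known, check each pipeline stage in order and report the first one where that document disappears. The stage names and ids below are hypothetical, and the sketch assumes each stage can be made to log the ids it passes on.

```python
def first_stage_missing(gold_id, stages):
    """Return the name of the first stage that no longer contains gold_id.

    `stages` is an ordered list of (stage_name, surviving_ids) pairs.
    Returns None when the document survives every stage, which suggests
    the problem lies after retrieval (for example in generation).
    """
    for name, surviving_ids in stages:
        if gold_id not in surviving_ids:
            return name
    return None


# Hypothetical trace for one failing query (assumed data).
gold_id = "log-402-billing"
stages = [
    ("indexed", {"log-402-billing", "http-status-guide", "log-timeout"}),
    ("metadata_filter", {"http-status-guide"}),  # filter dropped the gold doc
    ("retrieval_top_k", ["http-status-guide"]),
    ("rerank_top_n", ["http-status-guide"]),
]

print(first_stage_missing(gold_id, stages))
```

    Here the document is indexed but removed by the metadata filter, so tuning the embedding model or the prompt would not help.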

    Use Cases

    • Construct hybrid search pipelines combining semantic and keyword retrieval
    • Debug hallucination risks by implementing strict source grounding protocols
    • Optimize indexing strategies for low-latency document retrieval at scale
    • Architect multi-stage re-ranking workflows to improve answer precision
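
    A multi-stage re-ranking workflow of the kind listed above might look like the following sketch: a first stage returns a broad candidate set, and a second stage rescores those candidates and keeps the top few. The token-overlap scorer is only a stand-in for a real reranking model, and the candidate passages are hypothetical.

```python
def tokenize(text):
    """Lowercase whitespace tokenization; deliberately simplistic."""
    return set(text.lower().split())


def overlap_score(query, passage):
    """Toy relevance score: share of query tokens found in the passage.

    Stand-in for a real reranker such as a cross-encoder (assumption).
    """
    query_tokens = tokenize(query)
    if not query_tokens:
        return 0.0
    return len(query_tokens & tokenize(passage)) / len(query_tokens)


def rerank(query, candidates, score_fn, top_n):
    """Second stage: rescore (doc_id, text) candidates and keep the best top_n."""
    scored = [(score_fn(query, text), doc_id) for doc_id, text in candidates]
    # Highest score first; ties broken by id so output is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_n]]


# Hypothetical first-stage candidates, in the order the retriever returned them.
candidates = [
    ("a", "general http status codes overview"),
    ("b", "error 402 payment required in billing logs"),
    ("c", "billing service timeout error"),
]

top = rerank("error 402 billing", candidates, overlap_score, top_n=2)
print(top)
```

    Passing the scorer in as a function keeps the second stage swappable, so a stronger model can replace the toy scorer without touching the rest of the pipeline.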


    Security Scanned

    Passed automated security review

    Permissions

    No special permissions declared or detected

    Compatible with Claude Code, GitHub Copilot Extensions, Cursor, and SKILL.md-compatible agents.
