local-llm-troubleshooter
by LB Creations
Diagnose and fix broken local LLM stacks, GPU issues, and stalled model downloads across Ollama, LM Studio, and more.
- Identify why a local inference server is unresponsive or timing out
- Debug CUDA Out-of-Memory (OOM) errors and GPU-to-CPU fallback issues
- Detect and resume stalled Hugging Face or Ollama model downloads
Free
Included in download
- Downloadable skill package
- Works with Claude Code, Cursor
- 1 permission declared
Sample input
My Ollama server is responding but I can't get Llama 3 70b to load, it just hangs. Can you run the doctor script and tell me why it's stuck?
Sample output
Verdict: STUCK (Ollama)
Cause: CUDA Out of Memory (OOM) detected in logs.
Model: Llama-3-70b (Q4_K_M)
Fix:
- Set OLLAMA_NUM_PARALLEL=1
- Reduce context window from 8k to 4k in Modelfile.
- If issue persists, switch to Q2 quantization.
Verification: Re-run diagnostic to confirm GPU runner load.
About This Skill
What it does
The Local LLM Troubleshooter is a diagnostic tool for developers and AI engineers whose local inference stacks (Ollama, LM Studio, llama.cpp, vLLM) or Hugging Face model downloads are failing. It takes the guesswork out of questions like "why is my model slow?" or "why won't this load?" by running a bundled diagnostic script that probes connection states, scans logs for failure signatures, and detects stalled downloads.
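As a rough illustration of what probing a connection state can look like, here is a minimal sketch. It is not the code inside the bundled script; the function name probe_endpoint and the up/down/stuck rules are assumptions made for this example. It treats a refused connection as "down", a connection that accepts but never answers as "stuck", and any HTTP reply as "up".

```python
import socket


def probe_endpoint(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a local endpoint as 'up', 'down', or 'stuck' (hypothetical rules)."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Refused, unreachable, or connect timed out: nothing usable is listening.
        return "down"
    with sock:
        sock.settimeout(timeout)
        try:
            request = f"GET / HTTP/1.0\r\nHost: {host}\r\n\r\n"
            sock.sendall(request.encode("ascii"))
            reply = sock.recv(64)
        except socket.timeout:
            # The port accepted the connection but the server never answered.
            return "stuck"
        except OSError:
            return "down"
    # Any HTTP status line counts as 'up'; anything else is treated as 'stuck'.
    return "up" if reply.startswith(b"HTTP/") else "stuck"


if __name__ == "__main__":
    # 11434 is Ollama's default port; LM Studio's local server defaults to 1234.
    print(probe_endpoint("127.0.0.1", 11434))
```

A raw socket is used here so that a hung server can be told apart from a closed port; a higher-level HTTP client would report both as a generic error.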
Why use this skill
Prompting a generic AI about local hardware issues often leads to circular advice. This skill instead runs llm_doctor.py as a system sensor, so its advice is based on what is actually happening on your machine. It identifies specific technical blockers such as GGUF version mismatches, CUDA OOM (Out of Memory) errors, port conflicts, runner crashes, and stalled Hugging Face blobs, then maps these findings to a curated playbook of OS-specific fixes for Apple Silicon, NVIDIA, and WSL2 environments.
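To make the idea of scanning logs for failure signatures concrete, the sketch below matches log lines against a small table of regular expressions. The table, the labels, and the sample log lines are all hypothetical and chosen for illustration; the patterns that llm_doctor.py actually uses are not documented on this page.

```python
import re

# Hypothetical signature table: diagnosis label -> pattern to look for.
SIGNATURES = {
    "cuda_oom": re.compile(r"cuda.*out of memory", re.IGNORECASE),
    "port_conflict": re.compile(r"address already in use", re.IGNORECASE),
    "runner_crash": re.compile(r"runner.*(terminated|exited|crash)", re.IGNORECASE),
}


def scan_log(text: str) -> dict[str, list[int]]:
    """Return {label: [1-based line numbers]} for every signature that matched."""
    hits: dict[str, list[int]] = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits.setdefault(label, []).append(lineno)
    return hits


if __name__ == "__main__":
    # Made-up log excerpt, not real server output.
    sample = (
        "loading model\n"
        "CUDA error: out of memory\n"
        "bind: address already in use\n"
    )
    print(scan_log(sample))
```

Returning line numbers rather than a yes/no flag lets a report point at the exact log lines behind each finding.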
Supported tools
- Inference Servers: Ollama, LM Studio, llama.cpp, vLLM
- Model Sources: Hugging Face (hub downloads), Ollama library
- Formats and backends: GGUF models, local runners, GPU-accelerated backends
What the output looks like
The skill provides a structured triage report that includes a connectivity verdict (up/down/stuck), the specific bottleneck it identified, and an ordered list of high-probability fixes, ranging from context window adjustments to environment variable corrections.
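One way to picture such a report is as a small record with a verdict, a cause, and an ordered list of fixes. The sketch below is an assumption about shape only: the class name TriageReport and its fields are invented for this example and are not taken from the skill.

```python
from dataclasses import dataclass, field


@dataclass
class TriageReport:
    """Hypothetical container for one diagnostic result."""
    verdict: str                 # "UP", "DOWN", or "STUCK"
    stack: str                   # which server was probed, e.g. "Ollama"
    cause: str                   # the bottleneck that was identified
    fixes: list[str] = field(default_factory=list)  # most likely fix first

    def render(self) -> str:
        """Format the report as plain text, one fix per bullet."""
        lines = [
            f"Verdict: {self.verdict} ({self.stack})",
            f"Cause: {self.cause}",
            "Fix:",
        ]
        lines += [f"- {fix}" for fix in self.fixes]
        return "\n".join(lines)


if __name__ == "__main__":
    report = TriageReport(
        verdict="STUCK",
        stack="Ollama",
        cause="CUDA Out of Memory (OOM) detected in logs.",
        fixes=["Set OLLAMA_NUM_PARALLEL=1", "Reduce context window from 8k to 4k in Modelfile."],
    )
    print(report.render())
```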
Use Cases
- Identify why a local inference server is unresponsive or timing out
- Debug CUDA Out-of-Memory (OOM) errors and GPU-to-CPU fallback issues
- Detect and resume stalled Hugging Face or Ollama model downloads
- Resolve port conflicts and runner crashes in LM Studio or llama.cpp
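For the stalled-download case, a simple heuristic is to measure partial files twice and flag any that did not grow. The sketch below assumes partial files can be recognized by a name suffix; the default suffixes (".incomplete" and "-partial") and the function names are assumptions for this example, so check what your downloader actually writes to disk before relying on them.

```python
import time
from pathlib import Path


def snapshot_sizes(root: Path, suffixes: tuple[str, ...]) -> dict[Path, int]:
    """Map each partial file under root to its current size in bytes."""
    sizes: dict[Path, int] = {}
    for path in root.rglob("*"):
        if path.is_file() and path.name.endswith(suffixes):
            try:
                sizes[path] = path.stat().st_size
            except FileNotFoundError:
                # The download finished or was cleaned up mid-scan.
                continue
    return sizes


def find_stalled(
    root: Path,
    suffixes: tuple[str, ...] = (".incomplete", "-partial"),  # assumed naming
    interval: float = 5.0,
) -> list[Path]:
    """Return partial files whose size did not change over `interval` seconds."""
    before = snapshot_sizes(root, suffixes)
    time.sleep(interval)
    after = snapshot_sizes(root, suffixes)
    return sorted(p for p, size in before.items() if after.get(p) == size)
```

A file that disappears between the two snapshots is not reported, since it most likely completed. A longer interval reduces false positives on slow connections.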
Known Limitations
- Cannot auto-apply fixes; user must run commands manually.
- Requires Python 3 to run the diagnostic script.
- Limited visibility into proprietary, closed-source inference engines.
How to Install
mkdir -p ~/.claude/skills && curl -sL https://www.agensi.io/api/install/local-llm-troubleshooter -o /tmp/local-llm-troubleshooter.zip && unzip -o /tmp/local-llm-troubleshooter.zip -d ~/.claude/skills && rm /tmp/local-llm-troubleshooter.zip
Free skills install directly. Paid skills require purchase - use the download button above after buying.
Reviews
No reviews yet - be the first to share your experience.
Only users who have downloaded or purchased this skill can leave a review.
Security Scanned
Passed automated security review
Works with Claude Code, Cursor, Windsurf, and other SKILL.md-compatible agents.
Creator
LB designs and builds autonomous AI systems optimized for local deployment, specializing in distributed inference fleets, multi-model orchestration, and agent-native tooling. Everything runs on your hardware, with zero API fees.