OllamaWatch — Local LLM Health Monitor + Telegram Alerts
Monitor your local Ollama server and get Telegram alerts when models crash, the GPU runs out of memory, or downloads stall.
- Monitor AI server health 24/7 without manual log checking
- Receive Telegram alerts for CUDA Out-of-Memory (OOM) errors
- Detect stalled model downloads automatically in the background
$7
· or 35 credits
Secure checkout via Stripe
Included in download
- Monitor AI server health 24/7 without manual log checking
- Receive Telegram alerts for CUDA Out-of-Memory (OOM) errors
- terminal, file_read, file_write automation included
- Ready for Intel
- Instant install
Sample input
My Ollama server keeps going down overnight and I don't know why. Run OllamaWatch and set up Telegram alerts so I get notified the moment it goes down or recovers.
Sample output
🔴 OLLAMA DOWN
http://localhost:11434
Error: Connection refused (is Ollama running?)
💡 Ollama is not running. Start with: `ollama serve` (or check the systemd/service).
2026-06-07 21:42:18 UTC
---
✅ OLLAMA RECOVERED
http://localhost:11434
Latency: 142 ms
Loaded: llama3.1:8b-instruct-q4_K_M
2026-06-07 21:47:33 UTC
About This Skill
Stop babysitting your Ollama server. Get a Telegram alert the moment a model crashes, the GPU runs out of memory, or the API stops responding. Built for homelabbers and self-hosters running Ollama 24/7 who don't want to pay for SaaS monitoring or refresh dashboards manually.
Features:
- DOWN alerts (🔴): instant notification when the API is unreachable, hung, or returning errors
- DEGRADED alerts (⚠️): CUDA OOM, runner crashes, log errors, GPU memory above 95%
- RECOVERED alerts (✅): service is back, includes latency and currently loaded models
- Smart alerting: fires only on state transitions, no spam
- Built-in fix hints: every alert includes a suggested remediation
- NVIDIA GPU monitoring via nvidia-smi (gracefully skipped on AMD/Intel/CPU-only setups)
- Cross-platform: Linux, macOS, Windows (auto-detects log paths)
- Single-file Python script: no Docker, no Node.js, no SaaS subscription
- MIT licensed
Install:
pip install -r requirements.txt
python3 ollamawatch.py check   # one-shot health check
python3 ollamawatch.py watch   # continuous monitoring
Requires: Python 3.8+, Ollama (local or remote), free Telegram bot (5-min setup).
Use cases:
- Monitor a remote Ollama box you cannot see (headless homelab, VPS)
- Get woken up if an overnight batch job hangs the GPU
- Catch memory leaks before they take down your inference stack
- Detect when a model pull has stalled mid-download
- Replace expensive SaaS monitoring ($0/month vs $30+/month)
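The "fires only on state transitions" behavior described above can be illustrated with a small sketch. This is not the code shipped in ollamawatch.py; the class name `TransitionAlerter`, the `observe` method, and the state constants are hypothetical names chosen for illustration, and the rule that a first healthy check produces no alert is an assumption.

```python
from typing import Optional

# Hypothetical state names mirroring the three alert types listed above.
UP, DEGRADED, DOWN = "UP", "DEGRADED", "DOWN"


class TransitionAlerter:
    """Emit an alert only when the observed state changes."""

    def __init__(self) -> None:
        # Held in memory only, so a restart forgets the last state.
        self.last_state: Optional[str] = None

    def observe(self, state: str) -> Optional[str]:
        """Return an alert label for a state change, or None if nothing changed."""
        previous, self.last_state = self.last_state, state
        if state == previous:
            return None  # same state as the last check: stay quiet
        if state == DOWN:
            return "DOWN"
        if state == DEGRADED:
            return "DEGRADED"
        # State is UP: announce recovery only if we were previously unhealthy.
        if previous in (DOWN, DEGRADED):
            return "RECOVERED"
        return None  # first healthy observation: nothing to report


alerter = TransitionAlerter()
alerts = [alerter.observe(s) for s in [UP, UP, DOWN, DOWN, UP, DEGRADED, UP]]
```

Repeated DOWN checks produce a single alert, which is what keeps a long outage from flooding the chat.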
Use Cases
- Monitor AI server health 24/7 without manual log checking
- Receive Telegram alerts for CUDA Out-of-Memory (OOM) errors
- Detect stalled model downloads automatically in the background
- Check VRAM availability before loading large model weights
- Monitor a remote Ollama box you cannot see (headless homelab or VPS)
- Get woken up if an overnight batch job hangs the GPU
- Replace expensive SaaS monitoring with a $0 setup
- Catch memory leaks before they take down your inference stack
Known Limitations
- NVIDIA GPU only for memory monitoring; AMD/Intel/CPU work but skip GPU checks
- Log scanning requires the Ollama log file in a readable location (configurable)
- State held in memory; restart resets alert deduplication
- Watch mode minimum interval is 30 seconds
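As a rough illustration of the first limitation, a GPU memory check might skip itself when nvidia-smi is not installed. The sketch below is an assumption about how such a check could look, not the script's actual code; the function names `parse_gpu_memory` and `gpu_memory_percent` are hypothetical. It uses nvidia-smi's CSV query output and a fixed argument list with no shell.

```python
import shutil
import subprocess
from typing import List, Optional, Tuple

# Fixed argument list (no shell), using nvidia-smi's CSV query output.
NVIDIA_SMI_ARGS = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text: str) -> List[Tuple[int, int]]:
    """Parse 'used, total' MiB pairs, one line per GPU."""
    pairs = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        pairs.append((used, total))
    return pairs


def gpu_memory_percent() -> Optional[List[float]]:
    """Return per-GPU memory usage in percent, or None if it cannot be read."""
    if shutil.which("nvidia-smi") is None:
        return None  # AMD/Intel/CPU-only host: skip the GPU check
    try:
        result = subprocess.run(
            NVIDIA_SMI_ARGS, capture_output=True, text=True, timeout=10, check=True
        )
        pairs = parse_gpu_memory(result.stdout)
    except (subprocess.SubprocessError, OSError, ValueError):
        return None  # command failed or printed something unexpected
    return [100.0 * used / total for used, total in pairs if total]
```

A caller could then treat any value above 95 as a DEGRADED condition and treat `None` as "no GPU data", which matches the skip behavior described in the list.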
How to Install
mkdir -p ~/.claude/skills && curl -sL https://www.agensi.io/api/install/ollamawatch-local-llm-health-monitor-telegram-alerts -o /tmp/ollamawatch-local-llm-health-monitor-telegram-alerts.zip && unzip -o /tmp/ollamawatch-local-llm-health-monitor-telegram-alerts.zip -d ~/.claude/skills && rm /tmp/ollamawatch-local-llm-health-monitor-telegram-alerts.zip
Free skills install directly. Paid skills require purchase - use the download button above after buying.
Reviews
No reviews yet - be the first to share your experience.
Only users who have downloaded or purchased this skill can leave a review.
Security Scanned
Passed automated security review
Permissions
- Terminal: required to run nvidia-smi for GPU memory monitoring (read-only, fixed command, no shell expansion).
- Read/Write Files: config.json, .env, state file, log tailing.
- Network: outbound to the Ollama API (default localhost:11434) and api.telegram.org for alerts only.
- Environment Variables: reads TELEGRAM_BOT_TOKEN and TELEGRAM_CHAT_ID from .env (these two variables only).
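To make the network and environment-variable permissions concrete, here is a minimal sketch of how an alert could be sent through the Telegram Bot API `sendMessage` method. It is an illustration under assumptions, not the script's own implementation: the function names `build_telegram_request` and `send_alert` are hypothetical, and the sketch assumes the two variables from .env have already been loaded into the process environment.

```python
import json
import os
import urllib.request


def build_telegram_request(token: str, chat_id: str, text: str) -> urllib.request.Request:
    """Build a POST to the Telegram Bot API sendMessage method."""
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    payload = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}, method="POST"
    )


def send_alert(text: str) -> bool:
    """Send an alert; return False if credentials are missing or the call fails."""
    # Assumption: .env was loaded into the environment before this runs.
    token = os.environ.get("TELEGRAM_BOT_TOKEN")
    chat_id = os.environ.get("TELEGRAM_CHAT_ID")
    if not token or not chat_id:
        return False
    request = build_telegram_request(token, chat_id, text)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status == 200
    except OSError:  # URLError and HTTPError are OSError subclasses
        return False
```

Separating request construction from sending keeps the only outbound call in one place, which makes it easy to confirm that alerts go to api.telegram.org and nowhere else.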
Requires Python 3.8+ and an Ollama instance (local or remote). NVIDIA GPU monitoring is optional and degrades gracefully on AMD/Intel/CPU-only setups. Cross-platform: Linux, macOS, Windows.