💸 LLM Cost Leak Linter
Find the code that quietly runs up your LLM bill: uncapped generations, model calls looping with no limit, an expensive model doing a trivial job, and whole files stuffed into prompts. Read-only, every finding explained in plain English.
- Detect uncapped LLM generations in loops and retries.
- Identify expensive models used for high-frequency trivial tasks.
- Spot missing truncation in prompt assembly logic.
$12 · or 60 credits
Secure checkout via Stripe
Included in download
- file_read automation included
- Ready for Cursor
See it in action
You say
Scan the src directory for potential LLM cost leaks and tell me if I am wasting tokens.
Your agent does
Found 3 cost leaks:
- LCL002 (High): Loop in chains.py:42 calls gpt-4-turbo without a max_retries limit.
- LCL005 (Med): loaders.ts:12 passes raw file buffer to prompt without truncation.
- LCL003 (Low): Using gpt-4 for sentiment analysis on line 88; consider gpt-3.5-turbo.
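A finding like LCL002 is usually resolved by bounding both the number of attempts and the size of each generation. The following is a minimal sketch under stated assumptions, not the skill's own output or a specific provider's API: `call_model` is a hypothetical stand-in for whatever client function your code uses, and the two constants are illustrative values.

```python
import time
from typing import Callable, Optional

MAX_ATTEMPTS = 3          # assumption: hard ceiling on paid calls per request
MAX_OUTPUT_TOKENS = 256   # assumption: cap on generated tokens per call


def call_with_retry_cap(
    call_model: Callable[[str, int], str],
    prompt: str,
    max_attempts: int = MAX_ATTEMPTS,
    base_delay: float = 0.0,
) -> str:
    """Call the model at most max_attempts times, then give up.

    call_model is a hypothetical callable taking (prompt, max_output_tokens).
    """
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            # Every attempt carries the same output cap.
            return call_model(prompt, MAX_OUTPUT_TOKENS)
        except RuntimeError as exc:
            last_error = exc
            # Exponential backoff; base_delay=0.0 disables the wait.
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error
```

The key property is that the worst case costs a known amount: at most `max_attempts` calls, each producing at most `MAX_OUTPUT_TOKENS` tokens.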
About This Skill
The problem
LLM application code often contains hidden patterns that drain budgets before billing alerts trigger. Developers lack automated ways to detect uncapped generations, expensive models used for trivial tasks, or inefficient prompting patterns during development.
What it does
- Scans source files for LLM calls lacking output caps or truncation logic.
- Identifies model calls inside loops or retry blocks without attempt limits.
- Flags expensive models used for high-frequency or simple tasks based on model-pricing.json.
- Detects instances where entire files are injected into prompts without preprocessing.
- Points out missing token usage logging and cost instrumentation.
Frameworks & tools
Supports Python, JavaScript, TypeScript, and React (JSX/TSX). Works with any LLM provider but requires manual updates to pricing reference files.
Why this beats prompting it yourself
Generic LLMs often miss specific architectural cost leaks, such as unbounded retry loops or missing backoff limits. This skill runs a structured heuristic scan that maps code patterns to specific cost-risk IDs instead of offering vague advice.
Use cases
- Pre-production audit of AI features to prevent runaway costs.
- Reviewing legacy LLM implementations for potential savings.
- Standardizing cost-conscious coding practices across a team.
Known limitations
This is a heuristic linter, not a runtime monitor. It identifies risky patterns but does not track real-world token usage or live expenditure.
Use Cases
- Detect uncapped LLM generations in loops and retries.
- Identify expensive models used for high-frequency trivial tasks.
- Spot missing truncation in prompt assembly logic.
- Audit code for missing token and cost logging.
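The last two items, prompt truncation and usage logging, tend to be small helpers once a finding points at the right spot. The sketch below shows one possible shape for each. The character budget, the marker text, and the function names are all hypothetical, and a character count is only a rough proxy for tokens.

```python
import logging

logger = logging.getLogger("llm_cost")

MAX_PROMPT_CHARS = 8_000  # assumption: rough character budget, not a token count
MARKER = "\n[... truncated ...]\n"


def truncate_for_prompt(text: str, max_chars: int = MAX_PROMPT_CHARS) -> str:
    """Keep the head and tail of long text so the prompt size stays bounded."""
    if len(text) <= max_chars:
        return text
    keep = max_chars - len(MARKER)
    if keep <= 0:
        # Budget too small to fit the marker: fall back to a plain cut.
        return text[:max_chars]
    head = keep // 2
    tail = keep - head  # always >= 1 here, so text[-tail:] is safe
    return text[:head] + MARKER + text[-tail:]


def log_usage(model: str, prompt_tokens: int, completion_tokens: int) -> dict:
    """Record token counts for one call so spend can be audited later."""
    record = {
        "model": model,
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }
    logger.info("llm_usage %s", record)
    return record
```

Keeping both the head and the tail is a design choice that suits files where the start and end carry the most context; other inputs may call for a different strategy.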
Known Limitations
Heuristic. It flags cost-risk patterns for you to review. It does not measure your actual spend, count live tokens, or run your code, and it cannot confirm a fix works end to end. The expensive-model and cache checks are advisory. The model tiers and prices are only as current as the editable model-pricing.json you maintain. It reads code and structure, not runtime behavior.
How to Install
mkdir -p ~/.claude/skills && curl -sL https://www.agensi.io/api/install/llm-cost-leak-linter -o /tmp/llm-cost-leak-linter.zip && unzip -o /tmp/llm-cost-leak-linter.zip -d ~/.claude/skills && rm /tmp/llm-cost-leak-linter.zip
Free skills install directly. Paid skills require purchase: use the download button above after buying.
Security Scanned
Passed automated security review
Permissions
This skill needs Read Files only. It reads your source files to scan for cost-leak patterns, and reads its own references/model-pricing.json for the model tiers. Findings are returned as text in the conversation. It does not write to disk, run shell commands, make network calls, or read environment variables or secrets.
Runs anywhere a coding agent can execute a Python 3 script. Tested with Claude Code, Cursor, Codex CLI, Windsurf, and Cline. Python 3 standard library only: no third-party packages, no network calls, and it never runs your code. Scans Python, JavaScript, TypeScript, and JSX/TSX source. Update references/model-pricing.json to match your provider's current prices and model tiers.
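The listing does not document the layout of references/model-pricing.json, so the sketch below assumes a hypothetical schema purely to illustrate how tier and price data could drive an advisory check. The model names, field names, and prices are placeholders, not real provider rates or the skill's real file format.

```python
import json

# Hypothetical schema and placeholder prices; the real file may differ.
EXAMPLE_PRICING_JSON = """
{
  "models": {
    "example-large": {"tier": "expensive", "input_per_1m": 10.0, "output_per_1m": 30.0},
    "example-small": {"tier": "cheap", "input_per_1m": 0.5, "output_per_1m": 1.5}
  }
}
"""


def load_pricing(raw: str) -> dict:
    """Parse the pricing document and return the per-model table."""
    return json.loads(raw)["models"]


def estimate_cost(pricing: dict, model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one call from per-million-token prices."""
    entry = pricing[model]
    total = input_tokens * entry["input_per_1m"] + output_tokens * entry["output_per_1m"]
    return total / 1_000_000


def is_expensive(pricing: dict, model: str) -> bool:
    """True only for models explicitly tagged as expensive; unknown models are not flagged."""
    return pricing.get(model, {}).get("tier") == "expensive"
```

Whatever the real schema looks like, the point carries over: any tier or price comparison is only as accurate as the values you keep in the file.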