Codebase Archaeology
by Kaymue
Reverse-engineer unfamiliar code. Dependency map, dead code, risk hotspots, onboarding guide. Survive inheriting 200k lines.
- The "I just inherited 200k lines of code from someone who left" survival kit. Reverse-engineers any codebase into dependency maps, dead-code reports, risk hotspots, and an onboarding guide. Saves 2-3 weeks of tribal-knowledge gathering.
Free
Included in download
- Downloadable skill package
- 1 permission declared
- Instant install
About This Skill
# Codebase Archaeology

You just inherited 200,000 lines of code. The author left. There are no docs. The CEO wants a new feature by Friday. This skill turns "where do I even start" into a structured 2-week onboarding plan.

## What it does

A systematic reverse-engineering workflow for any codebase:

- **Dependency map**: module graph, circular deps, fan-in / fan-out metrics
- **Dead code report**: unused exports, unreachable functions, orphaned files
- **Risk hotspots**: files that change often + are complex + lack tests
- **Conventions detector**: what naming/structure does this codebase actually use
- **Hidden entry points**: scripts, cron jobs, CLI tools, undocumented APIs
- **Onboarding guide generator**: README, ARCHITECTURE.md, CONCEPTS.md
- **"Where do I change X"**: for a feature request, identifies all touchpoints

## When to use it

- You just joined a team and need to ramp up fast
- You inherited a legacy codebase with no docs
- You need to estimate the cost of a refactor
- You want to find dead code to delete (or test gaps to fill)
- You need to onboard a new hire
- You're auditing a codebase before acquisition

## Why it's better than ad-hoc prompting

Most "explain this codebase" prompts produce surface-level summaries. This skill is different:

- **Quantitative**: every module gets a score (complexity, coupling, churn)
- **Actionable**: outputs a prioritized 2-week plan, not just docs
- **Visual**: generates interactive dependency graphs (Mermaid)
- **Comprehensive**: covers 12 dimensions, not just "what does it do"
- **Cumulative**: second run shows what's changed since first

## Architecture

```
┌─────────────────────────────────────────────────────────┐
│  Agent (Claude/Cursor)                                  │
│  - Points at a codebase                                 │
│  - Runs archaeology scripts                             │
│  - Synthesizes findings + onboarding plan               │
└───────────────┬─────────────────────────────────────────┘
                │
                ▼
┌─────────────────────────────────────────────────────────┐
│  skills/codebase-archaeology/                           │
│    scripts/                                             │
│    ├── dependency_map.py   # Import graph + cycles      │
│    ├── dead_code.py        # Unused exports, funcs      │
│    ├── hotspots.py         # Churn × complexity         │
│    ├── conventions.py      # Style + pattern detection  │
│    ├── entry_points.py     # Scripts, cron, CLI         │
│    ├── onboarding_gen.py   # README, ARCHITECTURE       │
│    └── feature_locator.py  # "Where do I add X?"        │
│    references/                                          │
│    ├── onboarding-plan.md                               │
│    ├── hotspot-playbook.md                              │
│    └── dead-code-policy.md                              │
│    templates/                                           │
│    ├── ARCHITECTURE.md.tmpl                             │
│    └── CONCEPTS.md.tmpl                                 │
└─────────────────────────────────────────────────────────┘
```

## Quick start

```bash
# 1. Install
pip install networkx radon lizard pydeps mccabe

# 2. Generate dependency map
python scripts/dependency_map.py ./src --format mermaid > docs/architecture.md

# 3. Find dead code (Python)
python scripts/dead_code.py ./src --language python

# 4. Risk hotspots
python scripts/hotspots.py ./src --since "1 year ago"

# 5. Detect conventions
python scripts/conventions.py ./src

# 6. Find entry points
python scripts/entry_points.py .

# 7. Generate onboarding guide
python scripts/onboarding_gen.py ./src --output docs/

# 8. "Where do I add a new feature?"
python scripts/feature_locator.py ./src "user authentication"
```

## Sample onboarding output (excerpt)

```
# Codebase Onboarding Plan - 2 weeks

## Day 1-2: Reconnaissance
- [ ] Read README.md (auto-generated)
- [ ] Review ARCHITECTURE.md (auto-generated) - focus on:
  - Module structure (3 layers: api → service → data)
  - 3 main domains: users, billing, reports
- [ ] Skim 5 most-imported files (top of dependency map)
- [ ] Run the test suite once to know the baseline

## Day 3-4: Hotspot familiarization
- [ ] Open top 5 hotspot files (most changed + most complex)
- [ ] Read their tests - they encode the team's expectations
- [ ] Note the 3 "load-bearing" modules (high fan-in, low churn)

## Day 5-7: Make your first change (in test branch)
- [ ] Add a feature in the simplest module
- [ ] Run lints, tests, type checks
- [ ] Open a PR - observe review feedback patterns
- [ ] Update ARCHITECTURE.md with what you learned

## Day 8-10: Tackle a small bug
- [ ] Pick a low-priority issue
- [ ] Use feature_locator.py to find touchpoints
- [ ] Make the fix, add a regression test
- [ ] Note any "weird" code that needs explaining

## Day 11-14: Write your "I just joined" doc
- [ ] 3 things that surprised you
- [ ] 3 things that are broken-but-intentional
- [ ] 3 things you'd refactor given time
- [ ] Add to CONCEPTS.md (auto-updated each run)
```

## The 12 dimensions analyzed

1. **Module structure**: top-level layout, layer count, domain boundaries
2. **Dependency graph**: module imports, cycles, fan-in/fan-out
3. **Dead code**: unused exports, unreachable functions, orphan files
4. **Risk hotspots**: files with high churn AND high complexity
5. **Test coverage**: line + branch, gap analysis
6. **Style conventions**: naming, formatting, file structure
7. **Error handling**: exception patterns, error codes, retry logic
8. **Concurrency model**: threads, async, locks, actors
9. **External integrations**: APIs, DBs, queues, third-party libs
10. **Configuration**: env vars, config files, secrets
11. **Entry points**: main(), CLIs, cron, message handlers, webhooks
12. **Documentation gaps**: public functions without doc comments

## Pricing

Single-purchase, lifetime access. $9.00. Includes:

- 7 Python archaeology scripts
- 3 reference docs (onboarding plan, hotspot playbook, dead-code policy)
- 2 templates (ARCHITECTURE.md, CONCEPTS.md)
- Sample analysis of a real open-source project
- Future updates for the same major version

## Example usage

> "I'm joining a team next week. They have a 200k line Python/TypeScript monorepo with no docs. Give me a 2-week onboarding plan."

The skill will:

1. Run all 12 dimensions
2. Generate ARCHITECTURE.md from real data (not vibes)
3. Identify the 5 "must-understand" files
4. Output a day-by-day plan
5. Save findings to `docs/` for future team members

## Compatibility

Works with any agent that supports the SKILL.md standard and can execute Python: Claude Code, OpenClaw, Codex CLI, Cursor, Gemini CLI, Cline, Windsurf, Aider.

Supports Python, TypeScript, Go (full); Rust, Java (partial). Requires Git for hotspot/churn analysis. Tested on Linux, macOS, Windows.

## Tags

code-analysis, refactoring, documentation, onboarding, technical-debt, legacy, code-quality
How to Install
mkdir -p ~/.claude/skills && curl -sL https://www.agensi.io/api/install/codebase-archaeology -o /tmp/codebase-archaeology.zip && unzip -o /tmp/codebase-archaeology.zip -d ~/.claude/skills && rm /tmp/codebase-archaeology.zip
Free skills install directly. Paid skills require purchase; use the download button above after buying.
Security Scanned
Passed automated security review